ABSTRACT

Differences in clustering properties between galaxy subpopulations complicate the cosmo- 
logical interpretation of the galaxy power spectrum, but can also provide insights about the 
physics underlying galaxy formation. To study the nature of this relative clustering, we per- 
form a counts-in-cells analysis of galaxies in the Sloan Digital Sky Survey (SDSS) in which 
we measure the relative bias between pairs of galaxy subsamples of different luminosities 
and colours. We use a generalized test to determine if the relative bias between each pair 
of subsamples is consistent with the simplest deterministic linear bias model, and we also 
use a maximum likelihood technique to further understand the nature of the relative bias between
each pair. We find that the simple, deterministic model is a good fit for the luminosity-dependent
bias on scales above $\sim 2\,h^{-1}$ Mpc, which is good news for using magnitude-limited
surveys for cosmology. However, the colour-dependent bias shows evidence for stochasticity 
and/or non-linearity which increases in strength toward smaller scales, in agreement with pre- 
vious studies of stochastic bias. Also, confirming hints seen in earlier work, the luminosity- 
dependent bias for red galaxies is significantly different from that of blue galaxies: both lu- 
minous and dim red galaxies have higher bias than moderately bright red galaxies, whereas 
the biasing of blue galaxies is not strongly luminosity-dependent. These results can be used to 
constrain galaxy formation models and also to quantify how the colour and luminosity selec- 
tion of a galaxy survey can impact measurements of the cosmological matter power spectrum. 

Key words: galaxies: statistics - galaxies: distances and redshifts - methods: statistical - 
surveys - large-scale structure of Universe 



1 INTRODUCTION 

In order to use galaxy surveys to study the large-scale distribution 
of matter, the relation between the galaxies and the underlying mat- 
ter - known as the galaxy bias - must be understood. Developing 
a detailed understanding of this bias is important for two reasons: 
bias is a key systematic uncertainty in the inference of cosmologi- 
cal parameters from galaxy surveys, and it also has implications for 
galaxy formation theory. 

Since it is difficult to measure the dark matter distribution 
directly, we can gain insight by studying relative bias, i.e., the 
relation between the spatial distributions of different galaxy sub- 
populations. There is a rich body of literature on this subject
tracing back many decades (see, e.g., Hubble & Humason 1931;
Davis & Geller 1976; Hamilton 1988; White et al. 1988; Park et al.
1994; Loveday et al. 1995), and the subject has been studied extensively
in recent years as well, both theoretically (Seljak 2001; van den Bosch
et al. 2003; Cooray 2005; Sheth et al. 2005; Tinker et al. 2007) and
observationally. Such studies have established that biasing depends
on the type of galaxy under consideration - for example, early-type,
red galaxies are more clustered than late-type, blue galaxies
(Guzzo et al. 1997; Norberg et al. 2002; Madgwick et al. 2003;
Conway et al. 2005; Li et al. 2006; Croton et al. 2007), and luminous
galaxies are more clustered than dim galaxies (Willmer et al.
1998; Norberg et al. 2001; Tegmark et al. 2004b; Zehavi et al.
2005; Seljak et al. 2005; Skibba et al. 2006). Since different types
of galaxies do not exactly trace each other, it is thus impossible for 
them all to be exact tracers of the underlying matter distribution. 

More quantitatively, the luminosity dependence of bias has 
been measured in the 2 Degree Field Galaxy Redshift Survey (2dFGRS;
Colless et al. 2001) (Norberg et al. 2001, 2002) and in the
Sloan Digital Sky Survey (SDSS; York et al. 2000; Stoughton et al.
2002) (Tegmark et al. 2004b; Zehavi et al. 2005; Li et al. 2006) as
well as other surveys, and it is generally found that luminous galaxies
are more strongly biased, with the difference becoming more
pronounced above $L_*$, the characteristic luminosity of a galaxy in
the Schechter luminosity function (Schechter 1976).

These most recent studies measured the bias from ratios of
correlation functions or power spectra. The variances of clustering
estimators like correlation functions and power spectra are well- 
known to be the sum of two physically separate contributions: Pois- 
son shot noise (due to the sampling of the underlying continuous 
density field with a finite number of galaxies) and sample vari- 
ance (due to the fact that only a finite spatial volume is probed). 
On the large scales most relevant to cosmological parameter stud- 
ies, sample variance dominated the aforementioned 2dFGRS and 
SDSS measurements, and therefore dominated the error bars on the 
inferred bias. 

This sample variance is easy to understand: if the power spec- 
trum of distant luminous galaxies is measured to be different than 
that of nearby dim galaxies, then part of this measured bias could 
be due to the nearby region happening to be more/less clumpy 
than the distant one. In this paper, we will eliminate this annoy- 
ing sample variance by comparing how different galaxies cluster 
in the same region of space, extending the counts-in-cells work
of Tegmark & Bromley (1999), Blanton (2000), and Wild et al.
(2005) and the correlation function work of Norberg et al. (2001),
Norberg et al. (2002), Zehavi et al. (2005), and Li et al. (2006).
Here we use the counts-in-cells technique: we divide the survey 
volume into roughly cubical cells and compare the number of 
galaxies of each type within each cell. This yields a local, point- 
by-point measure of the relative bias rather than a global one as 
in the correlation function method. In other words, by comparing 
two galaxy density fields directly in real space, including the phase 
information that correlation function and power spectrum estima- 
tors discard, we are able to provide substantially sharper bias con- 
straints. 

This local approach also enables us to quantify so-called
stochastic bias (Pen 1998; Tegmark & Peebles 1998;
Dekel & Lahav 1999; Matsubara 1999). It is well-known that the
relation between galaxies and dark matter or between two different
types of galaxies is not necessarily deterministic - galaxy
formation processes that depend on variables other than the local
matter density give rise to stochastic bias as described in
Pen (1998), Tegmark & Peebles (1998), Dekel & Lahav (1999),
and Matsubara (1999). Evidence for stochasticity in the relative
bias between early-type and late-type galaxies has been presented
in Wild et al. (2005), Conway et al. (2005), Tegmark & Bromley
(1999), and Blanton (2000). Additionally, Simon et al. (2007) find
evidence for stochastic bias between galaxies and dark matter
via weak lensing. The time evolution of such stochastic bias has
been modelled in Tegmark & Peebles (1998) and was recently updated
in Simon (2005). Stochasticity is even predicted in the relative
bias between virialized clumps of dark matter (haloes) and
the linearly-evolved dark matter distribution (Casas-Miranda et al.
2002; Seljak & Warren 2004). Here we aim to test whether stochasticity
is necessary for modelling the luminosity-dependent or the
colour-dependent relative bias.

In this paper, we study the relative bias as a function of
scale using a simple stochastic biasing model by comparing pairs
of SDSS galaxy subsamples in cells of varying size. Such a
study is timely for two reasons. First of all, the galaxy power
spectrum has recently been measured to high precision on large
scales with the goal of constraining cosmology (Tegmark et al.
2004b, 2006; Blake et al. 2007; Padmanabhan et al. 2007). As
techniques continue to improve and survey volumes continue to
grow, it is necessary to reduce systematic uncertainties in order
to keep pace with shrinking statistical uncertainties. A detailed
understanding of complications due to the dependence of galaxy
bias on scale, luminosity, and colour will be essential for making
precise cosmological inferences with the next generation of
galaxy redshift surveys (Percival et al. 2004; Abazajian et al. 2003;
Percival et al. 2007; Zheng & Weinberg 2007; Moller et al. 2007;
Kristiansen et al. 2007).



Secondly, a great deal of theoretical progress on models of 
galaxy formation has been made in recent years, and 2dFGRS and 
SDSS contain a large enough sample of galaxies that we can now 
begin to place robust and detailed observational constraints on these
models. The framework known as the halo model (Seljak 2000)
(see Cooray & Sheth 2002 for a comprehensive review) provides
the tools needed to make comparisons between theory and observations.
The halo model assumes that all galaxies form in dark matter
haloes, so the galaxy distribution can be modelled by first determining
the halo distribution - either analytically (Catelan et al. 1998;
McDonald 2006; Smith et al. 2007) or using N-body simulations
(Smith et al. 2003; Kravtsov et al. 2004; Kravtsov 2006) - and then
populating the haloes with galaxies. This second step can be done
using semi-analytical galaxy formation models (Somerville et al.
2001; Berlind et al. 2003; Croton et al. 2006; Baugh 2006) or with
a statistical approach using a model for the halo occupation distribution
(HOD) (Peacock & Smith 2000; Berlind & Weinberg 2002;
Sefusatti & Scoccimarro 2005) or conditional luminosity function
(CLF) (Yang et al. 2003; van den Bosch et al. 2003) which
prescribes how galaxies populate haloes.

Although there are some concerns that the halo model does
not capture all of the relevant physics (Yang et al. 2003; Cooray
2006; Gao & White 2007), it has been applied successfully in a
number of different contexts (Scranton 2003; Collister & Lahav



tween a galaxy's environment (i.e., the local density of surrounding
galaxies) and its colour and luminosity (Hogg et al. 2004; Blanton
et al. 2005a) has been interpreted in the context of the halo
model (Berlind et al. 2005; Blanton et al. 2006; Abbas & Sheth
2006), and van den Bosch et al. (2003) and Cooray (2005) make
predictions for the bias as a function of galaxy type and luminosity
using the CLF formalism. Additionally, Zehavi et al. (2005),
Magliocchetti & Porciani (2003), and Abbas & Sheth (2005) use
correlation function methods to study the luminosity and colour 
dependence of galaxy clustering, and interpret the results using 
the Halo Occupation Distribution (HOD) framework. The analysis 
presented here is complementary to this body of work in that the 
counts-in-cells method is sensitive to larger scales, uses a differ- 
ent set of assumptions, and compares the two density fields directly 
in each cell rather than comparing ratios of their second moments. 
The halo model provides a natural framework in which to interpret 
the luminosity and colour dependence of galaxy biasing statistics 
we measure here. 

The rest of this paper is organized as follows: Section 2 describes
our galaxy data, and Sections 3.1 and 3.2 describe the construction
of our galaxy samples and the partition of the survey volume
into cells. In Section 3.3 we outline our relative bias framework,
and in Sections 3.4 and 3.5 we describe our two main analysis
methods. We present our results in Section 4 and conclude with
a qualitative interpretation of our results in the halo model context
in Section 5.



2 SDSS GALAXY DATA 



The SDSS (York et al. 2000; Stoughton et al. 2002) uses a mosaic
CCD camera (Gunn et al. 1998) on a dedicated telescope
(Gunn et al. 2006) to image the sky in five photometric bandpasses
denoted u, g, r, i and z (Fukugita et al. 1996). After astrometric
calibration (Pier et al. 2003), photometric data reduction
(Lupton et al. 2001), and photometric calibration (Hogg et al.
2001; Smith et al. 2002; Ivezić et al. 2004; Tucker et al. 2006),
galaxies are selected for spectroscopic observations. To a good
approximation, the main galaxy sample consists of all galaxies with
r-band apparent Petrosian magnitude $r < 17.77$ after correction
for reddening as per Schlegel et al. (1998); there are about 90 such
galaxies per square degree, with a median redshift of 0.1 and a
tail out to $z \sim 0.25$. Galaxy spectra are also measured for the
Luminous Red Galaxy sample (Eisenstein et al. 2001), which is
not used in this paper. These targets are assigned to spectroscopic
plates of diameter 2.98° by an adaptive tiling algorithm (Blanton
et al. 2003b) and observed with a pair of CCD spectrographs
(Uomoto et al. 2004), after which the spectroscopic data reduction
and redshift determination are performed by automated pipelines.
The rms galaxy redshift errors are of order $30\,\mathrm{km\,s^{-1}}$ for main
galaxies, hence negligible for the purposes of the present paper.

Our analysis is based on 380,614 main galaxies (the 'safe0'
cut) from the 444,189 galaxies in the 5th SDSS data release ('DR5')
(Adelman-McCarthy et al. 2007), processed via the SDSS data
repository at New York University (Blanton et al. 2005b). The details
of how these samples were processed and modelled are given
in Appendix A of Tegmark et al. (2004b) and in Eisenstein et al.
(2005). The bottom line is that each sample is completely specified
by three entities: 

(i) The galaxy positions (RA, Dec, and comoving redshift space 
distance r for each galaxy) ; 

(ii) The radial selection function $\bar{n}(r)$, which gives the expected
(not observed) number density of galaxies as a function of distance;

(iii) The angular selection function $\bar{n}(\hat{\mathbf{r}})$, which gives the
completeness as a function of direction in the sky, specified in a set of
spherical polygons (Hamilton & Tegmark 2004).

The three-dimensional selection functions of our samples are separable,
i.e., simply the product $\bar{n}(\mathbf{r}) = \bar{n}(\hat{\mathbf{r}})\,\bar{n}(r)$ of an angular
and a radial part; here $r \equiv |\mathbf{r}|$ and $\hat{\mathbf{r}} \equiv \mathbf{r}/r$ are the radial
comoving distance and the unit vector corresponding to the position
$\mathbf{r}$. The volume-limited samples used in this paper are constructed
so that their radial selection function $\bar{n}(r)$ is constant over
a range of $r$ and zero elsewhere. The effective sky area covered
is $\Omega \equiv \int \bar{n}(\hat{\mathbf{r}})\,d\Omega \approx 5183$ square degrees, and the typical
completeness $\bar{n}(\hat{\mathbf{r}})$ exceeds 90 per cent. The conversion from redshift
$z$ to comoving distance was made for a flat $\Lambda$CDM cosmological
model with $\Omega_m = 0.25$. Additionally, we make a first-order correction
for redshift space distortions by applying the finger-of-god
compression algorithm described in Tegmark et al. (2004b) with a
threshold density of $\delta_c = 200$.
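The redshift-to-distance conversion described above can be sketched as follows. This is a minimal illustration, not the pipeline used for the analysis: the function name `comoving_distance`, the Simpson's-rule integration and the number of steps are assumptions made here; only the flat $\Lambda$CDM model with $\Omega_m = 0.25$ comes from the text. Distances are in $h^{-1}$ Mpc, so $H_0 = 100\,h\,\mathrm{km\,s^{-1}\,Mpc^{-1}}$.

```python
import math

# Hubble distance c/H0 in h^-1 Mpc, for H0 = 100 h km/s/Mpc.
C_OVER_H0 = 2997.92458


def comoving_distance(z, omega_m=0.25, steps=200):
    """Comoving distance in h^-1 Mpc for a flat LambdaCDM model.

    Integrates dz / E(z) with composite Simpson's rule, where
    E(z) = sqrt(omega_m (1+z)^3 + 1 - omega_m).
    """
    if z <= 0.0:
        return 0.0

    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + (1.0 - omega_m))

    n = steps if steps % 2 == 0 else steps + 1  # Simpson needs an even count
    h = z / n
    total = inv_e(0.0) + inv_e(z)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * inv_e(k * h)
    return C_OVER_H0 * total * h / 3.0


if __name__ == "__main__":
    for z in (0.016, 0.1, 0.226):
        print(z, round(comoving_distance(z), 1))
```

At the median redshift of the main sample, $z = 0.1$, this gives a comoving distance of a little under $300\,h^{-1}$ Mpc.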



3 ANALYSIS METHODS 

3.1 Overlapping Volume-Limited Samples 

The basic technique used in this paper is pairwise comparison of 
the three-dimensional density fields of galaxy samples with different
colours and luminosities. As in Zehavi et al. (2005), we focus
on these two properties (as opposed to morphological type, spec- 
tral type, or surface brightness) for two reasons: they are straight- 
forward to measure from SDSS data, and recent work (Blanton et 
al. 2005a) has found that luminosity and colour is the pair of prop- 
erties that is most predictive of the local overdensity. Since colour 




Figure 1. Histogram of the comoving number density (after finger-of-god
compression) of the volume-limited samples L1-L7 as a function of comoving
distance $r$. The cuts used to define these samples are shown in Table 1.
Note that the radial selection function $\bar{n}(r)$ is uniform over the
allowed range for each sample. Arrows indicate volumes V1-V6 where
neighbouring volume-limited samples overlap.



and spectral type are strongly correlated, our study of the colour 
dependence of bias probes similar physics as studies using spectral
type (Tegmark & Bromley 1999; Blanton 2000; Norberg et al.
2002; Wild et al. 2005; Conway et al. 2005).

Our base sample of SDSS galaxies ('safe0') has an r-band apparent
magnitude range of $14.5 < r < 17.5$. Following the method
used in Tegmark et al. (2004b), we created a series of volume-limited
samples containing galaxies in different luminosity ranges.
These samples are defined by selecting a range of absolute magnitude
$M_{\rm luminous} < M_{^{0.1}r} < M_{\rm dim}$ and defining a redshift range
such that the near limit has $M_{^{0.1}r} = M_{\rm luminous}$, $r = 14.5$ and
the far limit has $M_{^{0.1}r} = M_{\rm dim}$, $r = 17.5$. Thus by discarding
all galaxies outside the redshift range, we are left with a sample
with a uniform radial selection function $\bar{n}(r)$ that contains all of
the galaxies in the given absolute magnitude range in the volume
defined by the redshift limits. Here $M_{^{0.1}r}$ is defined as the absolute
magnitude in the r-band shifted to a redshift of $z = 0.1$ (Blanton
et al. 2003a).

Our volume-limited samples are labelled L1 through L7, with
L1 being the dimmest and L7 being the most luminous. Figure 1
shows a histogram of the comoving galaxy density $\bar{n}(r)$ for L1-L7.
The cuts used to make these samples are shown in Table 1.

Each sample overlaps spatially only with the samples in neigh- 
bouring luminosity bins - since the apparent magnitude range spans 
three magnitudes and the absolute magnitude ranges for each bin 
span one magnitude, the far redshift limit of a given luminosity 
bin is approximately equal to the near redshift limit of the bin two 
notches more luminous. (It is not precisely equal due to evolution 
and K-corrections.) 
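The construction of the redshift limits can be illustrated with a short sketch. It is a simplified version of the procedure: it assumes a flat $\Lambda$CDM model with $\Omega_m = 0.25$ and deliberately neglects K-corrections and evolution, so the limits it returns differ slightly from the values actually used for the samples. All function names are hypothetical and introduced here only for illustration.

```python
import math

C_OVER_H0 = 2997.92458  # c/H0 in h^-1 Mpc


def comoving_distance(z, omega_m=0.25, steps=200):
    """Comoving distance in h^-1 Mpc, flat LambdaCDM, Simpson's rule."""
    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + (1.0 - omega_m))
    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * inv_e(k * h)
    return C_OVER_H0 * total * h / 3.0


def distance_modulus(z):
    """m - M for luminosity distance (1+z) D_C in h^-1 Mpc (no K-correction)."""
    d_lum = (1.0 + z) * comoving_distance(z)
    return 5.0 * math.log10(d_lum) + 25.0


def redshift_at_modulus(mu, z_lo=1e-4, z_hi=1.0, tol=1e-7):
    """Invert distance_modulus by bisection (it increases with z)."""
    while z_hi - z_lo > tol:
        mid = 0.5 * (z_lo + z_hi)
        if distance_modulus(mid) < mu:
            z_lo = mid
        else:
            z_hi = mid
    return 0.5 * (z_lo + z_hi)


def volume_limits(m_luminous, m_dim, r_bright=14.5, r_faint=17.5):
    """Near limit: most luminous galaxy at r_bright; far limit: dimmest at r_faint."""
    z_near = redshift_at_modulus(r_bright - m_luminous)
    z_far = redshift_at_modulus(r_faint - m_dim)
    return z_near, z_far


if __name__ == "__main__":
    print(volume_limits(-20.0, -19.0))  # magnitude range of sample L4
```

In this simplified setting the far limit of a bin coincides exactly with the near limit of the bin two magnitudes more luminous, because both correspond to the same distance modulus (for example $17.5 + 19 = 14.5 + 22 = 36.5$); the small mismatch noted in the text arises only once evolution and K-corrections are included.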

The regions where neighbouring volume-limited samples 
overlap provide a clean way to select data for studying the 
luminosity-dependent bias. By using only the galaxies in the overlapping
region from each of the two neighbouring luminosity bins,

Table 1. Summary of cuts used to create luminosity-binned volume-limited samples.

Luminosity-binned                                                   Comoving number
volume-limited samples   Absolute magnitude         Redshift              density $\bar{n}$ ($h^3\,\mathrm{Mpc}^{-3}$)
L1                       -17 < M_{0.1r} < -16       0.007 < z < 0.016     (1.63 ± 0.05) × 10^-2
L2                       -18 < M_{0.1r} < -17       0.011 < z < 0.026     (1.50 ± 0.03) × 10^-2
L3                       -19 < M_{0.1r} < -18       0.017 < z < 0.041     (1.23 ± 0.01) × 10^-2
L4                       -20 < M_{0.1r} < -19       0.027 < z < 0.064     (8.86 ± 0.05) × 10^-3
L5                       -21 < M_{0.1r} < -20       0.042 < z < 0.100     (5.02 ± 0.02) × 10^-3
L6                       -22 < M_{0.1r} < -21       0.065 < z < 0.152     (1.089 ± 0.005) × 10^-3
L7                       -23 < M_{0.1r} < -22       0.101 < z < 0.226     (4.60 ± 0.06) × 10^-5



Table 2. Overlapping volumes in which neighbouring luminosity bins are
compared.

Pairwise comparison      Overlapping
(overlapping) volumes    bins          Redshift
V1                       L1 & L2       0.011 < z < 0.016
V2                       L2 & L3       0.017 < z < 0.026
V3                       L3 & L4       0.027 < z < 0.041
V4                       L4 & L5       0.042 < z < 0.064
V5                       L5 & L6       0.065 < z < 0.100
V6                       L6 & L7       0.101 < z < 0.152



we obtain two sets of objects (one from the dimmer bin and one
from the more luminous bin) whose selection is volume-limited and
redshift-independent. Furthermore, since they occupy the same volume,
they are correlated with the same underlying matter distribution,
which eliminates uncertainty due to sample variance and
removes potential systematic effects due to sampling different volume
sizes (Joyce et al. 2005). We label the overlapping volume regions
V1 through V6, where V1 is defined as the overlap between
L1 and L2, and so forth. The redshift ranges for V1-V6 are shown
in Table 2.
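As a quick consistency check, the overlap volumes can be derived directly from the redshift ranges of the luminosity bins. The sketch below is illustrative only: the dictionary `SAMPLES` simply transcribes the redshift limits listed in Table 1, and the helper names are hypothetical.

```python
# Redshift limits of the volume-limited samples, transcribed from Table 1.
SAMPLES = {
    "L1": (0.007, 0.016),
    "L2": (0.011, 0.026),
    "L3": (0.017, 0.041),
    "L4": (0.027, 0.064),
    "L5": (0.042, 0.100),
    "L6": (0.065, 0.152),
    "L7": (0.101, 0.226),
}


def overlap(range_a, range_b):
    """Return the common redshift interval of two samples, or None."""
    lo = max(range_a[0], range_b[0])
    hi = min(range_a[1], range_b[1])
    return (lo, hi) if lo < hi else None


def overlapping_volumes(samples):
    """Pair each luminosity bin with its more luminous neighbour."""
    names = sorted(samples)
    return {
        f"V{i + 1}": overlap(samples[names[i]], samples[names[i + 1]])
        for i in range(len(names) - 1)
    }


if __name__ == "__main__":
    for name, z_range in overlapping_volumes(SAMPLES).items():
        print(name, z_range)
```

Non-neighbouring bins such as L1 and L3 have no common redshift range, which is why only neighbouring bins are compared.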

To study the colour dependence of the bias, we further divide 
each sample into red galaxies and blue galaxies. Figure 2 shows
the galaxy distribution of our volume-limited samples on a colour-
magnitude diagram. The sharp boundaries between the different
horizontal slices are due to the differences in density and total
volume sampled in each luminosity bin. This diagram illustrates
the well-known colour bimodality, with the redder galaxies falling
predominantly in a region commonly known as the E-S0 ridgeline
(Balogh et al. 2004; Mateus et al. 2006; Baldry et al. 2006). To separate
the E-S0 ridgeline from the rest of the population, we use the
same magnitude-dependent colour cut as in Zehavi et al. (2005):
we define galaxies with $^{0.1}(g - r) < 0.9 - 0.03\,(M_{^{0.1}r} + 23)$ to
be blue and galaxies on the other side of this line to be red.
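The colour cut above is simple enough to state as code. The following sketch uses hypothetical function names; the only ingredient taken from the text is the line $0.9 - 0.03\,(M + 23)$, with galaxies bluer than the line classified as blue and all others as red.

```python
def colour_cut(abs_mag):
    """Colour threshold at a given r-band absolute magnitude."""
    return 0.9 - 0.03 * (abs_mag + 23.0)


def is_red(g_minus_r, abs_mag):
    """True if the galaxy lies on the red side of the cut."""
    return g_minus_r >= colour_cut(abs_mag)


if __name__ == "__main__":
    # The threshold becomes bluer for dimmer galaxies.
    for abs_mag in (-23.0, -20.0, -17.0):
        print(abs_mag, round(colour_cut(abs_mag), 2))
```

The tilt of the cut means that a dim galaxy is classified as red at a bluer colour than a luminous one, following the slope of the ridgeline in the colour-magnitude diagram.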

In each volume V1-V6, we make four separate pairwise com- 
parisons: luminous galaxies vs. dim galaxies, red galaxies vs. blue 
galaxies, luminous red galaxies vs. dim red galaxies, and luminous 
blue galaxies vs. dim blue galaxies. The luminous vs. dim comparisons
measure the relative bias between galaxies in neighbouring
luminosity bins, and from this we can extract the luminosity
dependence of the bias for all galaxies combined and for red and
blue galaxies separately. The red vs. blue comparison measures the
colour-dependent bias. This set of four different types of pairwise
comparisons is illustrated in Fig. 3 for V4, and the number of galaxies
in each sample being compared is shown in Table 3.

3.2 Counts-in-Cells Methodology 

To compare the different pairs of galaxy samples, we perform a 
counts-in-cells analysis: we divide each comparison volume into 




Figure 2. Colour-magnitude diagram showing the number density distribu- 
tion of the galaxies in the volume-limited samples. The shading scale has a 
square-root stretch, with darker areas indicating regions of higher density. 
The line shows the colour cut of $^{0.1}(g - r) = 0.9 - 0.03\,(M_{^{0.1}r} + 23)$.
We refer to galaxies falling to the left of this line as blue and ones falling to 
the right of the line as red. 



Table 3. Number of galaxies in each sample being compared.

        All split by luminosity      All split by colour
        Luminous     Dim             Red        Blue
V1      427          651             125        953
V2      2102         2806            1117       3791
V3      6124         8273            5147       9250
V4      12122        23534           17144      18512
V5      11202        53410           37472      27140
V6      1784         38920           27138      13566

        Red split by luminosity      Blue split by luminosity
        Red luminous   Red dim       Blue luminous   Blue dim
V1      72             53            355             598
V2      620            497           1482            2309
V3      2797           2350          3327            5923
V4      6848           10296         5274            13238
V5      7514           29958         3688            23452
V6      1451           25687         333             13233

Figure 3. Galaxy distributions (after finger-of-god compression) plotted in comoving spatial co-ordinates for a radial slice of the volume-limited samples L4
(smaller dots, radial boundaries denoted by dashed lines) and L5 (larger dots, radial boundaries denoted by solid lines), which overlap in volume V4. Four 
different types of pairwise comparisons are illustrated: (a) luminous galaxies (L5) vs. dim galaxies (L4), (b) red galaxies vs. blue galaxies (both in V4), (c) 
luminous red galaxies (L5) vs. dim red galaxies (L4), and (d) luminous blue galaxies (L5) vs. dim blue galaxies (L4). The shaded regions denote the volume 
in which the two sets of galaxies are compared. A simple visual inspection shows that the different samples of galaxies being compared generally appear to 
cluster in the same physical locations - one key question we aim to answer here is if these observed correlations can be described with a simple linear bias 
model. 



roughly cubical cells and use the number of galaxies of each type 
in each cell as the primary input to our statistical analysis. This 
method is complementary to studies based on the correlation func- 
tion since it involves point-by-point comparison of the two density 
fields and thus provides a more direct test of the local deterministic 
linear bias hypothesis. We probe scale dependence by varying the 
size of the cells. 



To create our cells, we first divide the sky into two-dimensional
'pixels' at four different angular resolutions using
the SDSSPix pixelization scheme (see
http://lahmu.phyast.pitt.edu/~scranton/SDSSPix/) as implemented by an
updated version of the angular mask processing software MANGLE
(Hamilton & Tegmark 2004; Swanson et al. 2007). The angular selection
function $\bar{n}(\hat{\mathbf{r}})$ is averaged over each pixel to obtain the
completeness. To reduce the effects of pixels on the edge of the 
survey area or in regions affected by internal holes in the survey, 
we apply a cut on pixel completeness: we only use pixels with a 
completeness higher than 80 per cent (50 per cent for the lowest angular
resolution). Figure 4 shows the pixelized SDSS angular mask
at our four different resolutions, including only the pixels that pass
our completeness cut. The different angular resolutions have 15,
33, 157, and 901 of these angular pixels respectively. At the lowest
resolution, each pixel covers 353 square degrees, and the angular
area of the pixels decreases by a factor of 4 at each resolution
level, yielding pixels covering 88, 22, and 5 square degrees at the
three higher resolutions.

To produce three-dimensional cells from our pixels, we divide 
each comparison volume into radial shells of equal volume. We 
choose the number of radial subdivisions at each angular resolution 
in each comparison volume such that our cells are approximately 
cubical, i.e., the radial extent of a cell is approximately equal to 
its transverse (angular) extent. This procedure makes cells that are 
not quite perfect cubes - there is some slight variation in the cell 
shapes, with cells on the near edge of the volume slightly elongated 
radially and cells on the far edge slightly flattened. We state all of 
our results as a function of cell size L, defined as the cube root of 
the cell volume. At the lowest resolution, there is just 1 radial shell 
for each volume; at the next resolution, we have 3 radial shells for 
volumes V4 and V5 and 2 radial shells for the other volumes. There 
are 5 radial shells at the second highest resolution, and 10 at the 
highest. 

Since each comparison volume is at a different distance from 
us, the angular geometry gives us cells of different physical size 
in each of the volumes. At the lowest resolution, where there is 
only one shell in each volume, the cell size is $14\,h^{-1}$ Mpc in V1
and $134\,h^{-1}$ Mpc in V6. At the highest resolution, the cell size is
$1.7\,h^{-1}$ Mpc in V1 and $16\,h^{-1}$ Mpc in V6. Figure 5 shows the cells
in each volume V1-V6 that are closest to a size of $\sim 20\,h^{-1}$ Mpc,
the range in which the length scales probed by the different volumes
overlap. (These are the cells used to produce the results shown in
Fig. 8.)
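The cell geometry described above can be sketched numerically. The example below is an approximation under stated assumptions: it treats every pixel as having exactly the nominal area, uses a flat $\Lambda$CDM model with $\Omega_m = 0.25$ for distances, and ignores finger-of-god compression, so it reproduces the quoted cell sizes only to within several per cent. The function names are hypothetical.

```python
import math

C_OVER_H0 = 2997.92458  # c/H0 in h^-1 Mpc


def comoving_distance(z, omega_m=0.25, steps=200):
    """Comoving distance in h^-1 Mpc, flat LambdaCDM, Simpson's rule."""
    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + (1.0 - omega_m))
    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * inv_e(k * h)
    return C_OVER_H0 * total * h / 3.0


def shell_edges(r_near, r_far, n_shells):
    """Radii that split [r_near, r_far] into n_shells shells of equal volume."""
    cube_lo, cube_hi = r_near ** 3, r_far ** 3
    return [
        (cube_lo + (cube_hi - cube_lo) * k / n_shells) ** (1.0 / 3.0)
        for k in range(n_shells + 1)
    ]


def cell_size(pixel_area_deg2, z_near, z_far, n_shells):
    """Cell size L: cube root of the volume of one pixel in one radial shell."""
    solid_angle = pixel_area_deg2 * (math.pi / 180.0) ** 2  # steradians
    r_near, r_far = comoving_distance(z_near), comoving_distance(z_far)
    volume = solid_angle / 3.0 * (r_far ** 3 - r_near ** 3) / n_shells
    return volume ** (1.0 / 3.0)


if __name__ == "__main__":
    print(cell_size(353.0, 0.011, 0.016, 1))   # V1, lowest resolution
    print(cell_size(353.0, 0.101, 0.152, 1))   # V6, lowest resolution
```

With these simplifications the lowest-resolution cells come out at roughly $14\,h^{-1}$ Mpc in V1 and roughly $130\,h^{-1}$ Mpc in V6, close to the values quoted in the text.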



3.3 Relative Bias Framework 

Our task is to quantify the relationship between two fractional overdensity
fields $\delta_1(\mathbf{x}) \equiv \rho_1(\mathbf{x})/\bar{\rho}_1 - 1$ and $\delta_2(\mathbf{x}) \equiv \rho_2(\mathbf{x})/\bar{\rho}_2 - 1$
representing two different types of objects. This framework is commonly
used with types (1,2) representing (dark matter, galaxies), or
as in Blanton (2000), Wild et al. (2005), and Conway et al. (2005),
(early-type galaxies, late-type galaxies). Here we use it to represent
(more luminous galaxies, dimmer galaxies) or (red galaxies,
blue galaxies) to compare the samples described in Section 3.1.
Galaxies are of course discrete objects, and as customary, we use
the continuous field $\rho_\alpha(\mathbf{x})$ (where $\alpha = 1$ or 2) to formally refer to
the expectation value of the Poisson point process involved in distributing
the type $\alpha$ galaxies.

The simplest (and frequently assumed) relationship between
$\delta_1$ and $\delta_2$ is linear deterministic bias:

$$\delta_2(\mathbf{x}) = b_{\rm lin}\,\delta_1(\mathbf{x}), \qquad (1)$$

where $b_{\rm lin}$ is a constant parameter. This model cannot hold in all
cases - note that it can give negative densities if $b_{\rm lin} > 1$ - but
is typically a reasonable approximation on cosmologically large
length scales where the density fluctuations $\delta_1 \ll 1$, as is the case
for the measurements of the large scale power spectrum recently
used to constrain cosmological parameters (Tegmark et al. 2004b;
Cole et al. 2005; Tegmark et al. 2006).

Figure 5. A radial slice of the SDSS survey volume divided into cells of size $20\,h^{-1}$ Mpc with the galaxies in each cell (after finger-of-god compression)
shown in grey.



More complicated models allow for non-linearity and stochasticity,
as described in detail in Dekel & Lahav (1999):

$$\delta_2(\mathbf{x}) = b[\delta_1(\mathbf{x})]\,\delta_1(\mathbf{x}) + \epsilon(\mathbf{x}), \qquad (2)$$

where the bias $b$ is now a (typically slightly non-linear) function
of $\delta_1$. The stochasticity is represented by a random field $\epsilon$ - allowing
for stochasticity removes the restriction implied by deterministic
models that the peaks of $\delta_1$ and $\delta_2$ must coincide spatially.
Stochasticity is basically the scatter in the relationship between the
two density fields due to physical variables besides the local matter
density. Non-local galaxy formation processes can also give rise to
stochasticity, as discussed in Matsubara (1999).
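A toy numerical illustration of the difference between the deterministic and stochastic models is given below. It is not part of the analysis: the Gaussian overdensities, the values $b_{\rm lin} = 1.5$ and $\sigma_\epsilon = 0.2$, and all function names are assumptions chosen purely for illustration.

```python
import math
import random


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def linear_bias(delta1, b_lin):
    """Deterministic linear bias: delta2 = b_lin * delta1."""
    return [b_lin * d for d in delta1]


def stochastic_bias(delta1, b_lin, sigma_eps, rng):
    """Linear bias plus an uncorrelated Gaussian scatter term epsilon."""
    return [b_lin * d + rng.gauss(0.0, sigma_eps) for d in delta1]


rng = random.Random(42)
delta1 = [rng.gauss(0.0, 0.3) for _ in range(5000)]  # toy overdensity field
r_det = correlation(delta1, linear_bias(delta1, 1.5))
r_sto = correlation(delta1, stochastic_bias(delta1, 1.5, 0.2, rng))

if __name__ == "__main__":
    print(r_det, r_sto)
    # A void with delta1 = -0.8 and b_lin = 2 would need delta2 = -1.6,
    # i.e. a negative density, which is why the linear model cannot be exact.
    print(linear_bias([-0.8], 2.0))
```

The deterministic model gives a correlation coefficient of unity, while the added scatter pulls it below one; the second printout shows the unphysical overdensity below $-1$ that a linear model with $b_{\rm lin} > 1$ assigns to a deep void.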

We estimate the overdensity of galaxies of type $\alpha$ in cell $i$ by

$$g_\alpha^{(i)} \equiv \frac{N_\alpha^{(i)} - \bar{N}_\alpha^{(i)}}{\bar{N}_\alpha^{(i)}}, \qquad (3)$$

where $N_\alpha^{(i)}$ is the number of observed type $\alpha$ galaxies in cell $i$
and $\bar{N}_\alpha^{(i)}$ is the expected number of such galaxies, computed from
the average angular selection function in the pixel and normalized
so that the sum of $\bar{N}_\alpha^{(i)}$ over all cells in the comparison volume
matches the total number of observed type $\alpha$ galaxies. The
$n$-dimensional vectors

$$\mathbf{g}_\alpha \equiv \left(g_\alpha^{(1)}, \ldots, g_\alpha^{(n)}\right)^T \qquad (4)$$

contain the counts-in-cells data to which we apply the statistical
analyses in Sections 3.4 and 3.5.
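The overdensity estimate can be sketched in a few lines. The example assumes that the expected count in a cell is proportional to the pixel completeness times the cell volume (the cells within a comparison volume have equal volume by construction, so the volume factor is included only for generality); the function names and the toy numbers are hypothetical.

```python
def expected_counts(counts, completeness, volumes):
    """Expected counts, normalized so their sum equals the observed total."""
    weights = [c * v for c, v in zip(completeness, volumes)]
    norm = sum(counts) / sum(weights)
    return [norm * w for w in weights]


def overdensity(counts, completeness, volumes):
    """g = (N - Nbar) / Nbar for every cell."""
    expected = expected_counts(counts, completeness, volumes)
    return [(n - nbar) / nbar for n, nbar in zip(counts, expected)]


if __name__ == "__main__":
    counts = [12, 8, 10]
    completeness = [1.0, 0.9, 0.95]
    volumes = [1.0, 1.0, 1.0]
    print(expected_counts(counts, completeness, volumes))
    print(overdensity(counts, completeness, volumes))
```

Because of the normalization, the overdensities weighted by the expected counts sum to zero over the comparison volume.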

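The covariance matrix of $\mathbf{g}_\alpha$ is given by

$$\langle g_\alpha^{(i)} g_\alpha^{(j)} \rangle = \langle \delta_\alpha^{(i)} \delta_\alpha^{(j)} \rangle + N_\alpha^{ij}, \qquad (5)$$

where $\delta_\alpha^{(i)}$ is the average of $\delta_\alpha(\mathbf{x})$ over cell $i$ and, making the
customary assumption that the shot noise is Poissonian, the shot
noise covariance matrix $\mathbf{N}_\alpha$ is given by

$$N_\alpha^{ij} = \delta_{ij} / \bar{N}_\alpha^{(i)}. \qquad (6)$$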

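For comparing pairs of different types of galaxies, we construct the
data vector

$$\mathbf{g} \equiv \begin{pmatrix} \mathbf{g}_1 \\ \mathbf{g}_2 \end{pmatrix}, \qquad (7)$$

which has a covariance matrix

$$\mathbf{C} \equiv \langle \mathbf{g}\,\mathbf{g}^T \rangle = \mathbf{S} + \mathbf{N}, \qquad (8)$$

with

$$\mathbf{N} \equiv \begin{pmatrix} \mathbf{N}_1 & 0 \\ 0 & \mathbf{N}_2 \end{pmatrix}, \qquad \mathbf{S} \equiv \begin{pmatrix} \mathbf{S}_{11} & \mathbf{S}_{12} \\ \mathbf{S}_{12} & \mathbf{S}_{22} \end{pmatrix}, \qquad (9)$$

and the elements of the matrix $\mathbf{S}$ given by

$$S_{\alpha\beta}^{ij} = \langle \delta_\alpha^{(i)} \delta_\beta^{(j)} \rangle. \qquad (10)$$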

The diagonal form of $\mathbf{N}$ in equation (9) assumes that there are no
correlations between the shot noise of type 1 and type 2 galaxies
within a given cell $i$ - this means that the two galaxy distributions
are treated as independent Poisson processes that sample related
density distributions $\delta_1(\mathbf{x})$ and $\delta_2(\mathbf{x})$. Although one might expect
that constraining the counts of type 1 and type 2 galaxies in a cell
to sum to the total number of galaxies in the cell
could induce correlations in the shot noise, we do not explicitly use
the combined total count in our analyses - uncorrelated shot noise
is thus a reasonable assumption.
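The structure of the shot noise matrix under this assumption can be made concrete with a short sketch. It uses plain nested lists rather than a linear-algebra library, and the function names and toy expected counts are hypothetical; the only content taken from the text is that the Poisson shot noise is diagonal, with variance equal to the inverse of the expected count, and uncorrelated between the two galaxy types.

```python
def shot_noise_matrix(expected_1, expected_2):
    """Block-diagonal shot noise matrix for the stacked data vector (g1, g2).

    Diagonal entries are 1 / Nbar for each cell and type; every off-diagonal
    entry, including those coupling type 1 to type 2, is zero.
    """
    diag = [1.0 / n for n in expected_1] + [1.0 / n for n in expected_2]
    size = len(diag)
    return [[diag[i] if i == j else 0.0 for j in range(size)]
            for i in range(size)]


def total_covariance(signal, noise):
    """C = S + N, element by element."""
    return [[s + n for s, n in zip(s_row, n_row)]
            for s_row, n_row in zip(signal, noise)]


if __name__ == "__main__":
    noise = shot_noise_matrix([10.0, 20.0], [5.0, 40.0])
    for row in noise:
        print(row)
```

Cells with few expected galaxies therefore carry large shot noise, which limits how small the cells can usefully be made in the sparser samples.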

Regarding the matrix $\mathbf{S}$, other counts-in-cells analyses often
assume that the correlations between different cells can be
ignored, i.e., $\langle \delta_\alpha^{(i)} \delta_\beta^{(j)} \rangle = 0$ unless $i = j$. Here we account
for cosmological correlations by computing the elements
of $\mathbf{S}$ using the best-fitting $\Lambda$CDM matter power spectrum as we
will now describe in detail. The power spectrum $P_{\alpha\beta}(\mathbf{k})$ is defined
as $\langle \hat{\delta}_\alpha(\mathbf{k})\,\hat{\delta}_\beta(\mathbf{k}')^\dagger \rangle = (2\pi)^3\,\delta^D(\mathbf{k} - \mathbf{k}')\,P_{\alpha\beta}(\mathbf{k})$, where
$\hat{\delta}_\alpha(\mathbf{k}) \equiv \int e^{-i\mathbf{k}\cdot\mathbf{x}}\,\delta_\alpha(\mathbf{x})\,d^3x$ is the Fourier transform of the overdensity
field. $P_{11}(\mathbf{k})$ and $P_{22}(\mathbf{k})$ are the power spectra of type 1
and 2 galaxies respectively, and $P_{12}(\mathbf{k})$ is the cross spectrum between
type 1 and 2 galaxies. We assume isotropy and homogeneity,
so that $P_{\alpha\beta}(\mathbf{k})$ is a function only of $k \equiv |\mathbf{k}|$, and rewrite the galaxy
power spectra in terms of the matter power spectrum $P(k)$:



bi (fc), 62 (fc), and ri2 (fc) vary slowly in fc. These parameters 
are closely related to those in the biasing models specified in 
equations (TJ and ©I If linear deterministic biasing holds, then 
6rci ~ 6iin and rrci = 1, and the addition of either n on-linearity 
or stochasticity will give r^ci < 1. As discussed in iMatsubaral 
( ,199 9). stochasticity is expected to vanish in Fourier space (i.e., 
ri2 (fc) = 1) on large scales where the density fluctuations are 
small, but scale dependence of &i (fc) and 62 (fc) can still give rise to 
stochasticity in real space. We will measure the parameters 6rci (L) 
and Troi (L) as a function of scale, thus testing whether the bias 
is scale dependent and determining the range of scales on which 
linear biasing holds. 



Pii (fc) = bi {kf P (fc) 

P12 (fc) = 61 (fc) &2 (fc) ri2 (fc) P (fc) 

P22(fc) = b2{kfP{k), (11) 

which defines the functions bi (fc), 62 (fc), and ryi (fc)- 

To calculate (^Sa'' S'^p^ exactly, we need to convolve 5a(x) 
with a filter representing cell i and &p (x) with a filter representing 
cell j. This is complicated since our cells, while all roughly cubical, 
have slightly different shapes. We therefore use an approximation 
of a spherical top hat smoothing filter with radius R: w (r, R) = 
3/(47rP"^)0(P — r) with the Fourier transform given by 

3 



w (fc,P) 



{kRY 



[sin(fcP) - fcPcos (fcP)]. 



(12) 



R is chosen so that the effective scale corresponds to cubes with 
side length L: R = \/5/12 L, where 1/ is the cell size defined 
in Section [3.21 (See p. 500 in iPeacocldll999l for derivation of the 
■^5/12 factor.) This gives 



1 

2^ 



sin {kri 



-Pais (fc) |w(fc,P)r fcMk, 



(13) 

where rij is the distance between the centres of cells i and j. The 
kernel of this integrand - meaning everything besides Pais ik) here 
- typically peaks at fc ~ 1/P and is only non-negligible in a range 
of A logj^o fc ~ 1. Assuming that the functions 61 (fc), 62 (fc), and 
ri2 (fc) vary slowly with fc over this range, they can be approxi- 
mated by their values at fcpcak = 1/P = ^12/5/L and pulled 
outside the integral, allowing us to write 



S = al (L) 



Sm 

(L) Trcl (L) Sm 



brcl (L) Trcl (L) Sm 

brci (L) Sm 

(14) 

where (L) = bi (fcpcak) , &rci (L) = &2 (fcpcak) /bi (fcpcak), 
rrci (L) = ri2 (fcpcak), and Sm is the correlation matrix for the 
underlying matter density: 



1 

2^ 



sin (fcr 



kvi 



P{k)\w{k,R)f k'^dk. 



(15) 



For t he matter power sp ectrum P (fc), we use the fitting formula 
from iNovosvadlvi et al. IT999) with the best-fitting 'vanilla' pa- 
rameters from iTegmark e t al. (2004a) and apply the non-linear 
transformation of lSmith e t al. (2003). 

Our primary parameters are the relative bias factor fcrci (L), 
the relative cross-correlation coefficient r-rd {L), and the over- 
all normalization af (L). The only assumptions we have made 
in defining these parameters are homogeneity, isotropy, and that 
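
To make equations (12) and (15) concrete, the following Python sketch evaluates the top-hat window and the matter correlation matrix $S_M$ by direct numerical integration. It is a minimal sketch under stated assumptions: the names `tophat_window`, `matter_correlation_matrix` and `toy_power` are hypothetical, the power spectrum is an arbitrary smooth stand-in rather than the fitted $\Lambda$CDM spectrum used in the analysis, and the integral is approximated with a simple trapezoidal rule on a logarithmic grid.

```python
import numpy as np

def tophat_window(k, R):
    """Fourier transform of a spherical top hat of radius R (equation 12)."""
    x = np.asarray(k, dtype=float) * R
    small = x < 1e-3
    xs = np.where(small, 1.0, x)                   # placeholder avoids 0/0
    w = 3.0 * (np.sin(xs) - xs * np.cos(xs)) / xs**3
    return np.where(small, 1.0 - x**2 / 10.0, w)   # series limit as kR -> 0

def matter_correlation_matrix(centres, L, power, k):
    """S_M^{ij} of equation (15), integrated over the wavenumber grid k."""
    R = np.sqrt(5.0 / 12.0) * L                    # effective top-hat radius
    centres = np.asarray(centres, dtype=float)
    r = np.linalg.norm(centres[:, None, :] - centres[None, :, :], axis=-1)
    kernel = power(k) * tophat_window(k, R) ** 2 * k**2 / (2.0 * np.pi**2)
    # np.sinc(t) = sin(pi t) / (pi t), so this factor is sin(k r) / (k r).
    integrand = np.sinc(k * r[..., None] / np.pi) * kernel
    dk = np.diff(k)
    return np.sum(0.5 * (integrand[..., 1:] + integrand[..., :-1]) * dk, axis=-1)

def toy_power(k):
    """Arbitrary smooth stand-in for P(k); NOT a fitted LambdaCDM spectrum."""
    return 1.0e4 * k / (1.0 + (k / 0.02) ** 2) ** 1.5

k = np.logspace(-4, 1, 4000)
centres = [[0.0, 0.0, 0.0], [20.0, 0.0, 0.0], [40.0, 0.0, 0.0]]
S_M = matter_correlation_matrix(centres, L=20.0, power=toy_power, k=k)
```

Given `S_M`, the full signal matrix of equation (14) follows by scaling the four blocks with $\sigma_1^2$, $b_{\rm rel}$ and $r_{\rm rel}$.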



3.4 The Null-buster Test 

Can the relative bias between dim and luminous galaxies or between
red and blue galaxies be explained by simple linear deterministic
biasing? To address this question, we use the so-called
null-buster test described in Tegmark & Bromley (1999). For a pair
of different types of galaxies, we calculate a difference map

$$
\Delta g \equiv g_2 - f g_1 \tag{16}
$$

for a range of values of $f$. If equation (1) holds and $f = b_{\rm lin}$, then
the density fluctuations cancel and $\Delta g$ will contain only shot noise,
with a covariance matrix $\langle \Delta g \Delta g^T \rangle = N_\Delta \equiv N_2 + f^2 N_1$ - this is
our null hypothesis.

If equation (1) does not hold and the covariance matrix is instead
given by $\langle \Delta g \Delta g^T \rangle = N_\Delta + S_\Delta$, where $S_\Delta$ is some residual
signal, then the most powerful test for ruling out the null hypothesis
is the generalized $\chi^2$ statistic (Tegmark & Peebles 1998)

$$
\nu \equiv \frac{\Delta g^T N_\Delta^{-1} S_\Delta N_\Delta^{-1} \Delta g - {\rm Tr}\left( N_\Delta^{-1} S_\Delta \right)}{\left[ 2\, {\rm Tr}\left( N_\Delta^{-1} S_\Delta N_\Delta^{-1} S_\Delta \right) \right]^{1/2}}, \tag{17}
$$

which can be interpreted as the significance level (i.e. the number of
'sigmas') at which we can rule out the null hypothesis. As detailed
in Tegmark (1999), this test assumes that the Poissonian shot noise
contribution can be approximated as Gaussian but makes no other
assumptions about the probability distribution of $\Delta g$. It is a valid
test for any choice of $S_\Delta$ and reduces to a standard $\chi^2$ test if $S_\Delta = N_\Delta$,
but it rules out the null hypothesis with maximum significance
in the case where $S_\Delta$ is the true residual signal.

Using equations (9), (10), and (14), the covariance matrix of
$\Delta g$ can be written as

$$
\left\langle \Delta g \Delta g^T \right\rangle = \sigma_1^2 \left( f^2 - 2 b_{\rm rel} r_{\rm rel} f + b_{\rm rel}^2 \right) S_M + N_\Delta, \tag{18}
$$

where $S_M$ is given by equation (15). We use $S_\Delta = S_M$ in
equation (17) (note that $\nu$ is independent of the normalization of $S_\Delta$,
which scales out) since deviations from linear deterministic bias
are likely to be correlated with large-scale structure.

To apply the null-buster test, we compute $\nu$ as a function of $f$
and then minimize it. If the minimum value $\nu_{\min} > 2$, we rule out
linear deterministic bias at $> 2\sigma$. If the null hypothesis cannot be
ruled out and we choose to accept it as an accurate description of
the data, we can use the value of $f$ that gives $\nu_{\min}$ as a measure of
$b_{\rm rel}$.
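
The procedure just described (evaluate $\nu(f)$ from equation 17, then scan over $f$) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the shot noise is assumed diagonal and Poissonian as in equation (6), the scan is a brute-force grid rather than whatever minimizer the analysis actually used, and the toy data are a noiseless rescaling chosen so that the minimum is easy to verify by hand.

```python
import numpy as np

def null_buster_nu(f, g1, g2, nbar1, nbar2, S_delta):
    """Null-buster statistic nu(f) of equation (17) for diagonal shot noise."""
    dg = g2 - f * g1                          # difference map, equation (16)
    n_delta = 1.0 / nbar2 + f**2 / nbar1      # diagonal of N_Delta = N_2 + f^2 N_1
    w = dg / n_delta                          # N_Delta^{-1} dg
    A = S_delta / n_delta[:, None]            # N_Delta^{-1} S_Delta
    numerator = w @ S_delta @ w - np.trace(A)
    return numerator / np.sqrt(2.0 * np.trace(A @ A))

def minimise_nu(f_grid, g1, g2, nbar1, nbar2, S_delta):
    """Return (f_best, nu_min) from a brute-force scan over f_grid."""
    nu = np.array([null_buster_nu(f, g1, g2, nbar1, nbar2, S_delta)
                   for f in f_grid])
    i = int(np.argmin(nu))
    return f_grid[i], nu[i]

# Toy data: type 2 is an exact rescaling of type 1 (relative bias 0.9).
n = 50
g1 = np.sin(np.arange(n))
g2 = 0.9 * g1
nbar = np.full(n, 100.0)
f_best, nu_min = minimise_nu(np.linspace(0.5, 1.5, 101),
                             g1, g2, nbar, nbar, np.eye(n))
```

In this toy case the difference map vanishes at $f = 0.9$, so the scan recovers the input relative bias; with real data the value of $\nu_{\min}$ is then compared against the threshold of 2.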

We calculate the uncertainty on $b_{\rm rel}$ using two different methods.
The first method makes use of the fact that $\nu$ is a generalized
$\chi^2$ statistic: the uncertainty on $b_{\rm rel}$ is given by the range in $f$ that
gives $\sqrt{2N}\, \Delta\nu \le 1$, where $N$ is the number of degrees of
freedom (equal to the number of cells minus 1 fitted parameter).
This is a generalization of the standard $\Delta\chi^2 = 1$ uncertainty since
$\nu$ is a generalization of $(\chi^2 - N)/\sqrt{2N}$. The second method uses
jackknife resampling, which is described in Section B1 along with
a comparison of the two methods. We present all of our results
derived from the null-buster test using the uncertainties from the
jackknife method.



3.5 Maximum Likelihood Method 

In addition to the null-buster test, we use a maximum likelihood
analysis to determine the parameters $b_{\rm rel}$ and $r_{\rm rel}$. Our method is
a generalization of the maximum likelihood method used in previous
papers, accounting for correlations between different cells but
making a somewhat different set of assumptions.

In Blanton (2000), Wild et al. (2005), and Conway et al. (2005),
the probability of observing $N_1$ galaxies of type 1 and $N_2$ galaxies
of type 2 in a given cell is expressed as

$$
P(N_1, N_2) = \int_{-1}^{\infty} \int_{-1}^{\infty} {\rm Poiss}\left[ N_1, \bar{N}_1 (1 + \delta_1) \right] \, {\rm Poiss}\left[ N_2, \bar{N}_2 (1 + \delta_2) \right] f(\delta_1, \delta_2, \alpha) \, d\delta_1 \, d\delta_2, \tag{19}
$$

where $f(\delta_1, \delta_2, \alpha)$ is the joint probability distribution of $\delta_1$ and
$\delta_2$ in one cell, $\alpha$ represents a set of parameters which depend on
the biasing model, and ${\rm Poiss}(N, \lambda) \equiv \lambda^N e^{-\lambda}/N!$ is the Poisson
probability to observe $N$ objects given a mean value $\lambda$. The likelihood
function for $n$ cells is then given by

$$
\mathcal{L}(\alpha) \equiv \prod_{i=1}^{n} P\left( N_1^{(i)}, N_2^{(i)} \right), \tag{20}
$$

which is maximized with respect to the parameters $\alpha$. This treatment
makes two assumptions: it neglects correlations between different
cells and it assumes that the galaxy discreteness is Poissonian.
These assumptions greatly simplify the computation of $\mathcal{L}$, but
are understood to be approximations to the true process. Cosmological
correlations are known to exist on large scales, although
their impact on counts-in-cells analyses has been argued to be small
(Broadhurst et al. 1995; Conway et al. 2005). Semi-analytical
modelling (Sheth & Diaferio 2001; Berlind & Weinberg 2002),
N-body simulation (Casas-Miranda et al. 2002; Kravtsov et al. 2004),
and smoothed particle hydrodynamic simulation (Berlind et al.
2003) investigations suggest that the probability distribution for
galaxies/haloes is sub-Poissonian in some regimes, and in fact
non-Poissonian behaviour is implied by observations as well
(Yang et al. 2003; Wild et al. 2005).

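Dropping these two assumptions, we can write a more general
expression for the likelihood function for $n$ cells:

$$
\begin{aligned}
\mathcal{L}(\alpha, \beta) &\equiv P\left( N_1^{(1)}, \ldots, N_1^{(n)}, N_2^{(1)}, \ldots, N_2^{(n)} \right) \\
&= \int \cdots \int \prod_{i=1}^{n} P_g\left[ N_1^{(i)}, \bar{N}_1^{(i)} \left( 1 + \delta_1^{(i)} \right), \beta \right] P_g\left[ N_2^{(i)}, \bar{N}_2^{(i)} \left( 1 + \delta_2^{(i)} \right), \beta \right] \\
&\quad \times f\left( \delta_1^{(1)}, \ldots, \delta_1^{(n)}, \delta_2^{(1)}, \ldots, \delta_2^{(n)}, \alpha \right) d\delta_1^{(1)} \cdots d\delta_1^{(n)} \, d\delta_2^{(1)} \cdots d\delta_2^{(n)},
\end{aligned} \tag{21}
$$

where ${\rm Poiss}(N, \lambda)$ has been replaced with a generic probability
$P_g(N, \lambda, \beta)$ for the galaxy distribution parameterized by some
parameters $\beta$, and $f(\delta_1^{(1)}, \ldots, \delta_1^{(n)}, \delta_2^{(1)}, \ldots, \delta_2^{(n)}, \alpha)$ is a joint
probability distribution relating $\delta_1$ and $\delta_2$ in all cells. In practice,
this would be prohibitively difficult to calculate as it involves
$2n$ integrations (Dodelson et al. 1997), and would require
a reasonable parameterized form for $P_g(N, \lambda, \beta)$ as well as
$f(\delta_1^{(1)}, \ldots, \delta_1^{(n)}, \delta_2^{(1)}, \ldots, \delta_2^{(n)}, \alpha)$.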



In this paper, we take a simpler approach and approximate the
probability distribution for our data vector $g$ to be Gaussian with
the covariance matrix C as defined by equations (8), (9) and (14),
and use this to define our likelihood function in terms of the
parameters $\sigma_1^2$, $b_{\rm rel}$, and $r_{\rm rel}$:

$$
\mathcal{L}\left( \sigma_1^2, b_{\rm rel}, r_{\rm rel} \right) \equiv p\left( g_1^{(1)}, \ldots, g_1^{(n)}, g_2^{(1)}, \ldots, g_2^{(n)} \right) = \frac{1}{(2\pi)^n |C|^{1/2}} \exp\left( -\frac{1}{2} g^T C^{-1} g \right). \tag{22}
$$

Note that this includes the shot noise since C = S + N, and is not
precisely equivalent to assuming that $P_g$ and $f$ in equation (21) are
Gaussian.

For values of $|r_{\rm rel}| > 1$, the matrix C is singular, and thus
the likelihood function cannot be computed. Hence this analysis
method automatically incorporates the constraint that $|r_{\rm rel}| \le 1$,
which is physically expected for a cross-correlation coefficient.
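
The Gaussian likelihood of equation (22) is straightforward to evaluate once C has been assembled from equations (8), (9) and (14). The Python sketch below shows one way to do this; it is illustrative only, with hypothetical function names, and it assumes diagonal Poisson shot noise and a precomputed matter correlation matrix `S_M`. It simply refuses to evaluate the likelihood when the determinant of C is not positive, rather than attempting to reproduce the treatment of $|r_{\rm rel}| > 1$ in the actual analysis.

```python
import numpy as np

def covariance(sigma1_sq, b_rel, r_rel, S_M, nbar1, nbar2):
    """C = S + N, with S from equation (14) and N from equations (6) and (9)."""
    S_M = np.asarray(S_M, dtype=float)
    cross = b_rel * r_rel * S_M
    S = sigma1_sq * np.block([[S_M, cross],
                              [cross, b_rel**2 * S_M]])
    noise = 1.0 / np.concatenate([np.asarray(nbar1, dtype=float),
                                  np.asarray(nbar2, dtype=float)])
    return S + np.diag(noise)

def two_ln_likelihood(g, C):
    """2 ln L for equation (22); g stacks g_1 and g_2, so it has 2n entries."""
    sign, logdet = np.linalg.slogdet(C)
    if sign <= 0:
        raise ValueError("det(C) must be positive")
    chi2 = g @ np.linalg.solve(C, g)
    # (2 pi)^n with n cells corresponds to g.size = 2n entries.
    return -g.size * np.log(2.0 * np.pi) - logdet - chi2

# One-cell toy example that can be checked by hand.
C = covariance(1.0, 1.0, 0.5, [[1.0]], [10.0], [10.0])
g = np.array([1.0, 0.0])
value = two_ln_likelihood(g, C)
```

A parameter scan would call `covariance` and `two_ln_likelihood` over a grid of $(\sigma_1^2, b_{\rm rel}, r_{\rm rel})$ and keep the maximum.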

To determine the best fit values of our parameters for each
pairwise comparison, we maximize $2 \ln \mathcal{L}(\sigma_1^2, b_{\rm rel}, r_{\rm rel})$ with
respect to $\sigma_1^2$, $b_{\rm rel}$, and $r_{\rm rel}$. Since our method of comparing pairs
of galaxy samples primarily probes the relative biasing between
the two types of galaxies, it is not particularly sensitive to $\sigma_1^2$,
which represents the bias of type 1 galaxies relative to the dark
matter power spectrum used in equation (15). Thus we marginalize
over $\sigma_1^2$ and calculate the uncertainty on $b_{\rm rel}$ and $r_{\rm rel}$ using the
$\Delta(2 \ln \mathcal{L}) = 1$ contour in the $b_{\rm rel}$-$r_{\rm rel}$ plane. This procedure is
discussed in more detail in Section B2.



4 RESULTS 

4.1 Null-buster Results 

To test the deterministic linear bias model, we apply the null-buster
test described in Section 3.4 to the pairs of galaxy samples
described in Section 3.1. For studying the luminosity-dependent bias,
we use the galaxies in the more luminous bin as the type 1 galaxies
and the dimmer bin as the type 2 galaxies for each pair of neighbouring
luminosity bins, and repeat this in each volume V1-V6.
We do this for all galaxies and also for red and blue galaxies
separately. For the colour dependence, we use red galaxies as type 1
and blue galaxies as type 2, and again repeat this in each volume.
To determine the scale dependence, we repeat all of these tests for
four different values of the cell size $L$ as described in Section 3.2.



4.1.1 Is the bias linear and deterministic? 

The results are plotted in Fig. 6, which shows the minimum value
of the null-buster test statistic $\nu_{\min}$ vs. cell size $L$. According to
this test, deterministic linear biasing is in fact an excellent fit for
the luminosity-dependent bias: nearly all $\nu_{\min}$ fall within $|\nu_{\min}| < 2$,
indicating consistency with the null hypothesis at the $2\sigma$ level.
(There are a few exceptions in the case of the red galaxies, the
largest being $\nu_{\min} \approx 5$ for the smallest cell size in V3.) For
colour-dependent bias, however, deterministic linear biasing is ruled out
quite strongly, especially at smaller scales.

Figure 6. Null-buster results for pairwise comparisons. $\nu_{\min}$ measures the number of sigmas at which deterministic linear biasing can be ruled out as a model
of relative bias between the two samples being compared. Shaded areas indicate $|\nu_{\min}| < 2$, where the data are consistent with the null hypothesis at the $2\sigma$ level.
Four different types of pairwise comparison are illustrated: (a) luminous vs. dim, (b) red vs. blue, (c) luminous red vs. dim red, and (d) luminous blue vs. dim
blue. The different symbols denote the different comparison volumes V1-V6. The luminosity-dependent bias (a, c, d) is consistent with deterministic linear
biasing but colour-dependent bias (b) is not. The horizontal axis of each panel is the cell size in $h^{-1}$ Mpc.



The cases where the null hypothesis survives are quite noteworthy,
since this implies that essentially all of the large clustering
signal that is present in the data (and is visually apparent in Fig. 3)
is common to the two galaxy samples and can be subtracted out. For
example, for the V5 luminosity split at the highest angular resolution
($L = 10\, h^{-1}$ Mpc), clustering signal is detected at $953\sigma$ in the
faint sample ($\nu(f) \approx 953$ for $f = 0$) and at $255\sigma$ in the bright sample
($\nu(f) \approx 255$ for $f = \infty$), yet the weighted difference of the
two maps is consistent with mere shot noise ($\nu(0.88) \approx -0.63$).
This also shows that no luminosity-related systematic errors afflict
the sample selection even at that low level.



4.1.2 Is the bias independent of scale? 

For the luminosity-dependent bias, we use the value of $f$ that gives
$\nu_{\min}$ as a measure of $b_{\rm rel}$, the relative bias between two neighbouring
luminosity bins. Since deterministic linear bias is ruled out in
the case of the colour-dependent bias, we instead use the value of
$b_{\rm rel}$ from the likelihood analysis here. We find that although the
value of $b_{\rm rel}$ depends on luminosity, it does not appear to depend
strongly on scale, as can be seen in Fig. 7: in all plots the curves
appear roughly horizontal. To test this 'chi-by-eye' inference of
scale independence quantitatively, we applied a simple $\chi^2$ fit on
the four data points (or three in the colour-dependent case) in each
volume using a one-parameter model: a horizontal line with a constant
value of $b_{\rm rel}$. For this fit we use covariance matrices derived
from jackknife resampling, as discussed in Section B1. We define
this model to be a good fit if the goodness-of-fit value (the probability
that a $\chi^2$ as poor as the value calculated should occur by
chance, as defined in Press et al. 1992) exceeds 0.01.
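
The scale-independence test described above amounts to fitting a constant to a handful of correlated points and converting the resulting $\chi^2$ into a goodness-of-fit probability. A minimal Python sketch follows; the function names are hypothetical, the data values and covariance are made up, and the $\chi^2$ survival function is written out in closed form for integer degrees of freedom so that the example needs nothing beyond NumPy and the standard library.

```python
import math
import numpy as np

def chi2_sf(x, dof):
    """P(chi^2 >= x) for a positive integer number of degrees of freedom."""
    h = 0.5 * x
    if dof % 2 == 0:
        term, total = 1.0, 1.0                 # j = 0 term of the finite sum
        for j in range(1, dof // 2):
            term *= h / j
            total += term
        return math.exp(-h) * total
    total = math.erfc(math.sqrt(h))
    term = math.exp(-h) * math.sqrt(h) / math.gamma(1.5)
    for j in range(1, (dof - 1) // 2 + 1):
        total += term
        term *= h / (j + 0.5)
    return total

def fit_constant(y, cov):
    """Best-fitting constant, chi^2 and goodness of fit for correlated data."""
    y = np.asarray(y, dtype=float)
    ones = np.ones_like(y)
    cinv_ones = np.linalg.solve(cov, ones)
    b_hat = (cinv_ones @ y) / (cinv_ones @ ones)
    resid = y - b_hat
    chi2 = resid @ np.linalg.solve(cov, resid)
    return b_hat, chi2, chi2_sf(chi2, y.size - 1)

# Made-up relative-bias values at four cell sizes with uncorrelated errors.
b_hat, chi2, q = fit_constant([1.0, 1.1, 0.9, 1.0], 0.01 * np.eye(4))
good_fit = q > 0.01
```

With a jackknife covariance matrix in place of the diagonal one used here, the same function applies unchanged.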

We find that this model of no scale dependence is a good fit
for all data sets plotted in Fig. 7. We therefore find no evidence that
the luminosity- or colour-dependent bias is scale-dependent on the
scales we probe here ($2$-$160\, h^{-1}$ Mpc). This implies that recent
cosmological parameter analyses which use only measurements
on scales $> 60\, h^{-1}$ Mpc (e.g., Sanchez et al. 2006; Tegmark et al.
2006; Spergel et al. 2007) are probably justified in assuming scale
independence of luminosity-dependent bias.

In comparison to previous work (Zehavi et al. 2005; Li et al.
2006), it is perhaps surprising to see as little scale dependence as
we do - Li et al. (2006) find the luminosity-dependent bias to vary
with scale (see their fig. 4), in contrast to what we find here. The
measurement of luminosity-dependent bias in Zehavi et al. (2005)
agrees more closely with our observation of scale independence,
but their fig. 10 indicates that we might expect to see scale
dependence of the luminosity-dependent bias in the most luminous
samples. However, we measure the bias in our most luminous samples
(in V6) at $16$-$134\, h^{-1}$ Mpc, well above the range probed in
Zehavi et al. (2005), so there is no direct conflict here. Additionally,
fig. 13 of Zehavi et al. (2005) and fig. 10 of Li et al. (2006) show
that correlation functions of red and blue galaxies have significantly
different slopes, implying that the colour-dependent bias should be
strongly scale-dependent on $0.1$-$10\, h^{-1}$ Mpc scales. However,
the points $> 1\, h^{-1}$ Mpc in these plots (the range comparable to the
scales we probe here) do not appear strongly scale dependent, so
our results are not inconsistent with these correlation function
measurements. This interpretation is further supported by recent work
(Wang et al. 2007) that finds correlation functions for different
luminosities and colours to be roughly parallel above $\sim 1\, h^{-1}$ Mpc.



4.1.3 How bias depends on luminosity 

Our next step is to calculate the relative bias parameter $b/b_*$ (the
bias relative to $L_*$ galaxies) as a function of luminosity by combining
the measured values of $b_{\rm rel}$ between the different pairs of
luminosity bins. This function has been measured previously using
SDSS power spectra (Tegmark et al. 2004b) at length scales of
$\sim 60\, h^{-1}$ Mpc as well as SDSS (Zehavi et al. 2005; Li et al. 2006;
Wang et al. 2007) and 2dFGRS (Norberg et al. 2001) correlation
functions at length scales of $\sim 1\, h^{-1}$ Mpc - here we measure it at
length scales of $\sim 20\, h^{-1}$ Mpc.

The bias of each luminosity bin relative to the central bin L4
is given by

$$
\frac{b_1}{b_4} = b_{12} b_{23} b_{34}, \quad
\frac{b_2}{b_4} = b_{23} b_{34}, \quad
\frac{b_3}{b_4} = b_{34}, \quad
\frac{b_5}{b_4} = \frac{1}{b_{45}}, \quad
\frac{b_6}{b_4} = \frac{1}{b_{45} b_{56}}, \tag{23}
$$

where $b_{\alpha\beta}$ denotes the measured value of $b_{\rm rel}$ between luminosity
bins L$\alpha$ and L$\beta$ using all galaxies and $b_\alpha$ denotes the bias of
galaxies in luminosity bin L$\alpha$ relative to the dark matter. For each
pairwise comparison, we choose the value of $b_{\rm rel}$ calculated at the
resolution where the cell size is closest to $20\, h^{-1}$ Mpc, as illustrated
in Fig. 5. (Since we see no evidence for scale dependence of
$b_{\rm rel}$ for the luminosity-dependent bias, this choice does not strongly
influence the results.)

To compute the error bars on $b_\alpha/b_4$, we rewrite equation (23)
as a linear matrix equation using the logs of the bias values:

$$
A\, b_{\log} = b_{\log,{\rm rel}}, \tag{24}
$$

where $b_{\log,{\rm rel}}$ is a vector of the log of our
relative bias measurements $b_{\alpha\beta}$, $b_{\log}$ is a vector of the log of the
bias values $b_\alpha/b_4$, and $A$ is the matrix relating them. We determine
the covariance matrix $\Sigma_{\rm rel}$ of $b_{\rm rel}$ (a vector of the relative
bias measurements $b_{\alpha\beta}$) from the jackknife resampling described
in Section B1 and then compute the covariance matrix $\Sigma_{\log,{\rm rel}}$ of
$b_{\log,{\rm rel}}$ by

$$
\Sigma_{\log,{\rm rel}} = \left( B_{\rm rel}^{-1} \right)^T \Sigma_{\rm rel} B_{\rm rel}^{-1}, \tag{25}
$$

where $B_{\rm rel} \equiv {\rm diag}(b_{\rm rel})$. We invert equation (24) to give $b_{\log}$:

$$
b_{\log} = A^{-1} b_{\log,{\rm rel}}, \tag{26}
$$

with the covariance matrix for $b_{\log}$ given by

$$
\Sigma_{\log} = A^{-1} \Sigma_{\log,{\rm rel}} \left( A^{-1} \right)^T. \tag{27}
$$
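
The chain of steps in equations (23)-(27), followed by the conversion back to linear bias values, can be sketched in Python as below. This is a hedged illustration, not the code used for the analysis: `chain_bias` is a hypothetical helper, bins are indexed from zero, the number of bins and the input ratios are made up, and each measured ratio is assumed to be $b_k/b_{k+1}$ for neighbouring bins, matching the convention $b_{34} = b_3/b_4$ above.

```python
import numpy as np

def chain_bias(b_rel, cov_rel, ref):
    """Combine neighbouring-bin ratios b_rel[k] = b_k / b_{k+1} (0-based bins)
    into b_k / b_ref, propagating the covariance through the logs."""
    b_rel = np.asarray(b_rel, dtype=float)
    n_bins = b_rel.size + 1
    others = [a for a in range(n_bins) if a != ref]   # bins with unknown b_a / b_ref
    col = {a: j for j, a in enumerate(others)}
    # Row k encodes log b_rel[k] = log(b_k / b_ref) - log(b_{k+1} / b_ref).
    A = np.zeros((b_rel.size, len(others)))
    for k in range(b_rel.size):
        if k in col:
            A[k, col[k]] += 1.0
        if k + 1 in col:
            A[k, col[k + 1]] -= 1.0
    B_rel_inv = np.diag(1.0 / b_rel)
    cov_log_rel = B_rel_inv.T @ cov_rel @ B_rel_inv   # equation (25)
    A_inv = np.linalg.inv(A)
    b = np.exp(A_inv @ np.log(b_rel))                 # equation (26), exponentiated
    cov_log = A_inv @ cov_log_rel @ A_inv.T           # equation (27)
    cov_b = np.diag(b) @ cov_log @ np.diag(b)         # back to linear bias values
    return others, b, cov_b

b_rel = [0.90, 0.95, 0.98, 0.90, 0.80]                # made-up b_12, ..., b_56
others, b, cov_b = chain_bias(b_rel, 1e-4 * np.eye(5), ref=3)
```

Because several output ratios share the same input measurements, `cov_b` has non-zero off-diagonal elements even when the input covariance is diagonal.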



We then fit our data with the model used in Norberg et al.
(2001): $b(M)/b_* = a_1 + a_2 (L/L_*)$, parameterized by $a \equiv
(a_1, a_2)$. Here $M$ is the central absolute magnitude of the bin,
$L$ is the corresponding luminosity, and $M_* = -20.83$. We use
a weighted least-squares fit that is linear in the parameters $(a_1, a_2)$
- that is, we solve the matrix equation

$$
\begin{pmatrix} b_1/b_4 \\ b_2/b_4 \\ b_3/b_4 \\ b_4/b_4 \\ b_5/b_4 \\ b_6/b_4 \end{pmatrix}
=
\begin{pmatrix} 1 & L_1/L_4 \\ 1 & L_2/L_4 \\ 1 & L_3/L_4 \\ 1 & L_4/L_4 \\ 1 & L_5/L_4 \\ 1 & L_6/L_4 \end{pmatrix}
\begin{pmatrix} a_1 \\ a_2 \end{pmatrix}, \tag{28}
$$

or $b = Xa$, where $b$ is a vector of the bias values $b_\alpha/b_4$ and $X$ is
the matrix representing our model. We solve for $a$ using

$$
a = \left( X^T \Sigma^{-1} X \right)^{-1} X^T \Sigma^{-1} b. \tag{29}
$$

Here $\Sigma$ is the covariance matrix of $b$, given by

$$
\Sigma = B\, \Sigma_{\log}\, B^T, \tag{30}
$$

where $\Sigma_{\log}$ is given by equation (27) and $B \equiv {\rm diag}(b)$. This
procedure gives us the best-fitting values for the parameters $a_1$ and
$a_2$, accounting for the correlations between the data points that are
induced when we compute the bias values $b$ from our relative bias
measurements. We then normalize the model such that $b(M_*)/b_* = 1$.
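
A weighted least-squares fit of this kind has the standard closed-form solution $a = (X^T \Sigma^{-1} X)^{-1} X^T \Sigma^{-1} b$, which the Python sketch below implements. It is an illustration under stated assumptions rather than the fit performed here: the function names are hypothetical, the bin magnitudes and covariance are made up, the mock data are noiseless, the luminosity ratio is taken relative to $M_*$ directly via $L/L_* = 10^{-0.4(M - M_*)}$, and the covariance matrix is assumed to be invertible.

```python
import numpy as np

def luminosity_ratio(M, M_ref):
    """L / L_ref from absolute magnitudes: 10^(-0.4 (M - M_ref))."""
    return 10.0 ** (-0.4 * (np.asarray(M, dtype=float) - M_ref))

def weighted_linear_fit(X, b, cov):
    """Solve b = X a by generalised least squares; returns a and cov(a)."""
    cinv_X = np.linalg.solve(cov, X)          # Sigma^{-1} X
    cinv_b = np.linalg.solve(cov, b)          # Sigma^{-1} b
    fisher = X.T @ cinv_X                     # X^T Sigma^{-1} X
    a = np.linalg.solve(fisher, X.T @ cinv_b)
    return a, np.linalg.inv(fisher)

M_STAR = -20.83
M = np.array([-17.5, -18.5, -19.5, -20.5, -21.5, -22.5])   # made-up bin centres
X = np.column_stack([np.ones_like(M), luminosity_ratio(M, M_STAR)])
b = X @ np.array([0.85, 0.15])                              # noiseless mock data
a, cov_a = weighted_linear_fit(X, b, 0.01 * np.eye(M.size))
```

With $a_1 + a_2 = 1$ the fitted model satisfies $b(M_*)/b_* = 1$ automatically; otherwise the curve can be divided by $a_1 + a_2$ to impose that normalization.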

Figure 8 shows a plot of $b/b_*$ vs. $M$: results for all galaxies
are plotted with black open circles, our best-fitting model is
shown by the solid line, the best-fitting model from Norberg et al.
(2001) is shown by the grey dashed line, and the best-fitting model
from Tegmark et al. (2004b) is shown by the dotted line. The error
bars represent the diagonal elements of $\Sigma$ from equation (30).
Our model, with $(a_1, a_2) = (0.862, 0.138)$, agrees extremely
well with the model from Norberg et al. (2001), with $(a_1, a_2) =
(0.85, 0.15)$. This agreement is quite remarkable since we use data
from a different survey and analyse it with a completely different
technique.

Figure 7. Relative bias $b_{\rm rel}$ between pairwise samples: (a) luminous vs. dim, (b) red vs. blue, (c) luminous red vs. dim red, and (d) luminous blue vs. dim
blue, revealing no significant scale dependence of luminosity- or colour-dependent bias. The $b_{\rm rel}$ values shown for luminosity-dependent splittings (a), (c), and
(d) were computed with the null-buster analysis; those shown for the colour-dependent splitting (b) were computed with the likelihood analysis. The different
symbols denote the different comparison volumes V1-V6.

A comparison of our results with previous measurements is
shown in Fig. 9 in the left panel. In order to compare our SDSS
results with results from 2dFGRS (Norberg et al. 2002; Croton et al.
2007), we have added a constant offset of $-1.13$ to their quoted
values for $M_{b_J} - 5 \log_{10} h$ in order to line up the value of $M_*$
used in Norberg et al. (2002) ($M^*_{b_J} - 5 \log_{10} h = -19.7$) with the
value used here ($M^*_{^{0.1}r} = -20.83$). Note that this is necessarily a
rough correction since the magnitude in the different bands varies
depending on the spectrum of each galaxy, but this method provides
a reasonable means of comparing the different results. This
plot shows excellent agreement over a wide range of scales, lending
further support to our conclusion that the luminosity-dependent
bias is independent of scale.

We also use equation (26) to calculate $b/b_*$ vs. $M$ for red and
blue galaxies separately. To plot the points for red, blue, and all
galaxies on the same $b/b_*$ vs. $M$ plot, we need to determine their
relative normalizations. Applying equation (26) to the red and blue
galaxies gives $b_{\alpha,{\rm red}}/b_{4,{\rm red}}$ and $b_{\alpha,{\rm blue}}/b_{4,{\rm blue}}$, so to normalize the
red-galaxy and blue-galaxy data points to the all-galaxy data points
in Fig. 8 we need to calculate

$$
\frac{b_{\alpha,{\rm red}}}{b_{*,{\rm all}}} = \frac{b_{4,{\rm all}}}{b_{*,{\rm all}}} \frac{b_{4,{\rm red}}}{b_{4,{\rm all}}} \frac{b_{\alpha,{\rm red}}}{b_{4,{\rm red}}}, \tag{31}
$$

$$
\frac{b_{\alpha,{\rm blue}}}{b_{*,{\rm all}}} = \frac{b_{4,{\rm all}}}{b_{*,{\rm all}}} \frac{b_{4,{\rm red}}}{b_{4,{\rm all}}} \frac{b_{4,{\rm blue}}}{b_{4,{\rm red}}} \frac{b_{\alpha,{\rm blue}}}{b_{4,{\rm blue}}}. \tag{32}
$$

Figure 8. Luminosity dependence of bias for all (open circles), red (solid
triangles), and blue (solid squares) galaxies at a cell size of $\sim 20\, h^{-1}$ Mpc
from null-buster results. The solid line is a model fit to the all-galaxy data
points, the dotted line shows the model from Tegmark et al. (2004b), and
the grey dashed line shows the model from Norberg et al. (2001). The
Norberg et al. (2001) model has been computed using the SDSS r-band
value of $M_* = -20.83$.

Figure 9. Comparison to previous results for the luminosity dependence of bias for all, red, and blue galaxies. Norberg et al. (2002), Zehavi et al. (2005),
Li et al. (2006), and Wang et al. (2007) use correlation function measurements, Tegmark et al. (2004b) use the power spectrum, and Croton et al. (2007) use
counts in cells. To better illustrate the similarities and differences in the trends as a function of luminosity, we have normalized all measurements to match
our results using the bin closest to $M_* = -20.83$. The error bars shown are all relative: they do not include uncertainties due to the normalization. Numbers
in parentheses denote the scale in $h^{-1}$ Mpc at which the measurements were done. Also shown are theoretical models from van den Bosch et al. (2003) (we
show their model B as a representative example) and Cooray (2005) - these are also normalized to match our results at $M_*$.

The factor $b_{4,{\rm all}}/b_{*,{\rm all}}$ is simply the normalization factor chosen
for the above model to give $b(M_*)/b_* = 1$. To determine
$b_{4,{\rm red}}/b_{4,{\rm all}}$, we use the best-fitting values of $\sigma_1^2$ from the likelihood
analysis described in Section 3.5 at the resolution with cell sizes
closest to $20\, h^{-1}$ Mpc: $\sigma_1^2$ from the comparison of dimmer and
more luminous galaxies in V3 gives $b_{4,{\rm all}}^2$, and similarly $\sigma_1^2$ from
the comparison of blue and red galaxies in L4 gives $b_{4,{\rm red}}^2$, so

$$
\frac{b_{4,{\rm red}}}{b_{4,{\rm all}}} = \left( \frac{\sigma^2_{1,\ {\rm red\ vs.\ blue\ L4}}}{\sigma^2_{1,\ {\rm lum\ vs.\ dim\ V3}}} \right)^{1/2}. \tag{33}
$$

The blue points are then normalized relative to the red points using
$b_{4,{\rm blue}}/b_{4,{\rm red}}$ equal to the measured value of $b_{\rm rel}$ from the likelihood
comparison of blue and red galaxies in L4. Thus the shapes
of the red and blue curves are determined using the luminosity-dependent
bias from the null-buster analysis, but their normalization
uses information from the likelihood analysis as well.

Splitting the luminosity dependence of the bias by colour
reveals some interesting features. The bias of the blue galaxies
shows only a weak dependence on luminosity, and both luminous
($M \sim -22$) and dim ($M \sim -17$) red galaxies have slightly higher
bias than moderately bright ($M \sim -20 \approx M_*$) red galaxies. The
previously observed luminosity dependence of bias, with a weak
dependence dimmer than $L_*$ and a strong increase above $L_*$, is
thus quite sensitive to the colour selection: the lower luminosity
bins contain mostly blue galaxies and thus show weak luminosity
dependence, whereas the more luminous bins are dominated by red
galaxies which drive the observed trend of more luminous galaxies
being more strongly biased. It is instructive to compare these
results with the mean local overdensity in colour-magnitude space,
as in fig. 2 of Blanton et al. (2005a). Although our bias measurements
are necessarily much coarser, it can be seen that the bias is
strongest where the overdensity is largest, as has been seen previously
(Abbas & Sheth 2006).

Comparisons of our results to other measurements of
luminosity-dependent bias for red and blue galaxies are shown in
Fig. 9 in the middle and right panels. Indications of the differing
trends for red and blue galaxies have been observed in previous
work: an early hint of the upturn in the bias for dim red galaxies
was seen in Norberg et al. (2002), and recent results (Wang et al.
2007) also indicate higher bias for dim red galaxies at scales
$> 1\, h^{-1}$ Mpc. However, there is some inconsistency between these
results and those of Zehavi et al. (2005) and Li et al. (2006) regarding
the dim red galaxies: they find that dim red galaxies exhibit
the strongest clustering on scales $< 1\, h^{-1}$ Mpc and luminous red
galaxies exhibit the strongest clustering on larger scales, as can be
seen from the green points in Fig. 9. This is shown in fig. 14 of
Zehavi et al. (2005) and fig. 11 of Li et al. (2006). However, we
find the dim red galaxies to have higher bias than $L_*$ red galaxies
at all the scales we probe ($2$-$40\, h^{-1}$ Mpc in this case). This
upturn of the bias for dim red galaxies is present in the halo model-based
theoretical curves from van den Bosch et al. (2003), although
not in the theoretical curves from Cooray (2005). Note also that
van den Bosch et al. (2003) use the data from Norberg et al. (2002)
to constrain their models so the agreement between the theory and
data should be interpreted with some caution.

Recent measurements of higher order clustering statistics
(Croton et al. 2007) find the same trends in the clustering strengths
of red and blue galaxies, although they indicate that their linear bias
measurement (which should be comparable to ours) shows the opposite
trends - little luminosity dependence for red galaxies and a
slight monotonic increase for blue galaxies. However, their luminosity
range is much narrower than ours so the trends are less clear,
and placing their data points on Fig. 9 shows that they are in good
agreement with our results.

Previous studies (Norberg et al. 2002; Li et al. 2006;
Wang et al. 2007) have also reported a somewhat stronger luminosity
dependence of blue galaxy clustering than we have
measured here. As can be seen in Fig. 9, Norberg et al. (2002)
and Wang et al. (2007) measure slightly higher bias for luminous
blue galaxies, and Li et al. (2006) measure slightly lower bias for
dim blue galaxies. Although the quantitative disagreement is fairly
small, the qualitative trends of the previous studies imply that the
bias of blue galaxies increases with luminosity, as opposed to our
measurement which indicates a lack of luminosity dependence.



4.2 Likelihood Results 

To study the luminosity dependence, colour dependence and
stochasticity of bias in more detail, we also apply the maximum
likelihood method described in Section 3.5 to all of the same pairs
of samples used in the null-buster test. Due to constraints on computing
power and memory, we perform these calculations for only
three values of the cell size $L$ rather than four, dropping the highest
resolution (smallest cell size) shown in Fig. 4. The likelihood analysis
makes a few additional assumptions, but provides a valuable
cross-check and also a measurement of the parameter $r_{\rm rel}$ which
encodes the stochasticity and non-linearity of the relative bias.

For each pair of samples, the likelihood function given in
equation (22) is maximized with respect to the parameters $\sigma_1^2$,
$b_{\rm rel}$, and $r_{\rm rel}$ and marginalized over $\sigma_1^2$ to determine the
best-fitting values of $b_{\rm rel}$ and $r_{\rm rel}$, with uncertainties defined by the
$\Delta(2 \ln \mathcal{L}) = 1$ contour in the $b_{\rm rel}$-$r_{\rm rel}$ plane. As we discuss in
Section B2.3, the values of $b_{\rm rel}$ found here are consistent with those
determined using the null-buster test.

Figure 10 shows the best-fitting values of $r_{\rm rel}$ as a function
of cell size $L$. For the comparisons between neighbouring luminosity
bins, the results are consistent with $r_{\rm rel} = 1$. On the other
hand, the comparisons between red and blue galaxies give $r_{\rm rel} < 1$,
with smaller cell sizes $L$ giving smaller values of $r_{\rm rel}$. This confirms
the null-buster result that the luminosity-dependent bias can
be accurately modelled using simple deterministic linear bias but
colour-dependent bias demands a more complicated model. Also,
$r_{\rm rel}$ for the colour-dependent bias is seen to depend on scale but not
strongly on luminosity. In contrast, $b_{\rm rel}$ (both in the null-buster and
likelihood analyses) depends on luminosity but not on scale.

To summarize, we find that the simple, deterministic model
is a good fit for the luminosity-dependent bias, but the colour-dependent
bias shows evidence for stochasticity and/or non-linearity
which increases in strength towards smaller scales. These
results are consistent with previous detections of stochasticity/non-linearity
in spectral-type-dependent bias (Tegmark & Bromley
1999; Blanton 2000; Conway et al. 2005), and also agree with
Wild et al. (2005), which measures significant stochasticity between
galaxies of different colour or spectral type, but not between galaxies
of different luminosities.

We compare our results for $r_{\rm rel}$ for red and blue galaxies to
previous results in Fig. 11. This shows good agreement between
our results and those of Wild et al. (2005) ($r_{\rm lin}$ from their fig. 11),
implying that these results are quite robust since our analysis uses a
different data set, employs different methods, and makes different
assumptions.

For the results from cross-correlation measurements, however,
the agreement is not as clear. Zehavi et al. (2005) find that
the cross-correlation between red and blue galaxies (their fig. 24)
indicates that $r_{\rm rel}$ is consistent with 1 on scales >
However, it is not clear that this result disagrees with ours, as
their result is for luminous galaxies ($M_{^{0.1}r} < -21$) and we
do not see a strong indication of $r_{\rm rel} < 1$ for our V6 sample
($-23 < M_{^{0.1}r} < -21$).

More recent cross-correlation measurements (Wang et al. 2007)
do find evidence for stochasticity/non-linearity between red
and blue galaxies at scales $< 1\,h^{-1}\,$Mpc and also show an
indication that dimmer galaxies have slightly lower values of
$r_{\rm rel}$. Note also that the method of calculating $r_{\rm rel}$ by
taking ratios of cross- and auto-correlation functions, as used for
Zehavi et al. (2005) and Wang et al. (2007), does not automatically
incorporate the constraint that $|r_{\rm rel}| \le 1$ as our analysis
does, so their error bars are allowed to extend above $r_{\rm rel} = 1$
in Fig. 11.
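The unbounded nature of a ratio-based estimate can be seen in a minimal sketch. It assumes the usual definition of a cross-correlation coefficient, $r_{\rm rel} = \xi_{12}/\sqrt{\xi_{11}\,\xi_{22}}$, evaluated at a single scale; the function name and the input amplitudes are illustrative and are not values from Zehavi et al. (2005) or Wang et al. (2007).

```python
import math


def r_rel_from_correlations(xi_11, xi_22, xi_12):
    """Cross-correlation coefficient from two auto-correlation amplitudes
    (xi_11, xi_22) and one cross-correlation amplitude (xi_12)."""
    if xi_11 <= 0 or xi_22 <= 0:
        raise ValueError("auto-correlation amplitudes must be positive")
    return xi_12 / math.sqrt(xi_11 * xi_22)


# Hypothetical amplitudes, chosen only for illustration.
r_exact = r_rel_from_correlations(4.0, 1.0, 2.0)  # perfectly correlated fields
r_noisy = r_rel_from_correlations(4.0, 1.0, 2.2)  # cross term scattered upwards
```

Nothing in the ratio prevents a noisy cross-correlation estimate from pushing the result above 1, whereas a fit that parametrizes the covariance matrix directly can restrict $r_{\rm rel}$ to the physical range.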

Overall, the counts-in-cells measurements (this paper;
Wild et al. 2005) show stronger evidence for stochasticity/non-linearity
at larger scales than the cross-correlation measurements
(Zehavi et al. 2005; Wang et al. 2007), indicating either that there
might be some slight systematic variation between the two methods
or that the counts-in-cells method is more sensitive to these effects.



5 CONCLUSIONS 

To shed further light on how galaxies trace matter, we have quan- 
tified how different types of galaxies trace each other. We have 
analysed the relative bias between pairs of volume-limited galaxy 
samples of different luminosities and colours using counts-in-cells 
at varying length scales. This method is most sensitive to length 
scales between those probed by correlation function and power 
spectrum methods, and makes point-by-point comparisons of the 
density fields rather than using ratios of moments, thereby
eliminating sample variance and obtaining a local rather than global
measure of the bias. We applied a null-buster test on each pair of
subsamples to determine if the relative bias was consistent with
deterministic linear biasing, and we also performed a
maximum-likelihood analysis to find the best-fitting parameters for a
simple stochastic biasing model.

Figure 10. The best-fitting values of the relative cross-correlation coefficient $r_{\rm rel}$ between pairwise samples. Four different types of pairwise comparison are
illustrated: (a) luminous vs. dim, (b) red vs. blue, (c) luminous red vs. dim red, and (d) luminous blue vs. dim blue. The different symbols denote the different
comparison volumes V1-V6.

5.1 Biasing results 

Our primary results are: 

(i) The luminosity-dependent bias for red galaxies is significantly
different from that of blue galaxies: the bias of blue galaxies
shows only a weak dependence on luminosity, whereas both luminous
and dim red galaxies have higher bias than moderately bright
($L_*$) red galaxies.

(ii) Both of our analysis methods indicate that the simple,
deterministic model is a good fit for the luminosity-dependent bias, but
that the colour-dependent bias is more complicated, showing strong
evidence for stochasticity and/or non-linearity on scales
$< 10\,h^{-1}\,$Mpc.

(iii) The luminosity-dependent bias is consistent with being
scale-independent over the range of scales probed here
(2-160 $h^{-1}\,$Mpc). The colour-dependent bias depends on
luminosity but not on scale, while the cross-correlation coefficient
$r_{\rm rel}$ depends on scale but not strongly on luminosity, giving
smaller $r_{\rm rel}$ values at smaller scales.

These results are encouraging from the perspective of using
galaxy clustering to measure cosmological parameters: simple
scale-independent linear biasing appears to be a good approximation
on the $> 60\,h^{-1}\,$Mpc scales used in many recent cosmological
studies (e.g. Sánchez et al. 2006; Tegmark et al. 2006;
Spergel et al. 2007). However, further quantification of small
residual effects will be needed to do full justice to the precision
of next-generation data sets on the horizon. Moreover, our results
regarding colour sensitivity suggest that more detailed bias studies
are worthwhile for luminous red galaxies, which have emerged as
a powerful cosmological probe because of their visibility at large
distances and near-optimal number density (Eisenstein et al. 2001,
2005; Tegmark et al. 2006), since colour cuts are involved in their
selection.

Figure 11. Comparison of relative cross-correlation coefficient
between red and blue galaxies as measured with different techniques. The
points from Zehavi et al. (2005) are extracted from cross-correlation
measurements between red and blue galaxies in SDSS with
$M_{^{0.1}r} < -21$, and the Wild et al. (2005) points are from a
counts-in-cells analysis using all 2dFGRS galaxies. Our results and the
Wang et al. (2007) results (also from SDSS) are separated by luminosity:
symbols are the same as in Fig. 10, and for the Wang et al. (2007)
results, open triangles denote their dimmest bin
($-19 < M_{^{0.1}r} < -18$) and open squares denote their most luminous
bin ($-23 < M_{^{0.1}r} < -21.5$). The length scales used in
Wang et al. (2007) are averages over small scales
(0.16-0.98 $h^{-1}\,$Mpc) and large scales (0.98-9.8 $h^{-1}\,$Mpc);
points here are shown in the middle of these ranges and offset for
clarity.

5.2 Implications for galaxy formation 

What can these results tell us about galaxy formation in the context
of the halo model? First of all, as discussed in Zehavi et al. (2005),
the large bias of the faint red galaxies can be explained by the fact
that such galaxies tend to be satellites in high-mass haloes, which
are more strongly clustered than low-mass haloes. Previous studies
have found that central galaxies in low-mass haloes are preferentially
blue, central galaxies in high-mass haloes tend to be red, and
that the luminosity of the central galaxy is strongly correlated with
the halo mass (Yang et al. 2005; Zheng et al. 2005). Our observed
lack of luminosity dependence of the bias for blue galaxies would
then be a reflection of the correlation between luminosity and halo
mass being weaker for blue galaxies than for red ones. Additional
work is needed to study this quantitatively and compare it with
theoretical predictions from galaxy formation models.

The detection of stochasticity between red and blue galaxies
may imply that red and blue galaxies tend to live in different haloes.
A study of galaxy groups in SDSS (Weinmann et al. 2006) recently
presented evidence supporting this, but this is at odds with
the cross-correlation measurement in Zehavi et al. (2005), which
implies that blue and red galaxies are well-mixed within haloes.
The fact that the stochasticity is strongest at small scales suggests 
that this effect is due to the 1-halo term, i.e., arising from pairs of 
galaxies in the same halo, although some amount of stochasticity 
persists even for large scales. However, the halo model implications 
for stochasticity have not been well-studied to date. 

In summary, our results on galaxy biasing and future work 
along these lines should be able to deepen our understanding 
of both cosmology (by quantifying systematic uncertainties) and 
galaxy formation. 
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APPENDIX A: CONSISTENCY CHECKS 
A1 Alternate null-buster analyses

In order to test the robustness of our results against various
systematic effects, we have repeated the null-buster analysis with four
different modifications: splitting the galaxy samples randomly,
offsetting the pixel positions, using galaxy positions without applying
the finger-of-god compression algorithm, and ignoring the
cosmological correlations between neighbouring cells.

A1.1 Randomly split samples

The null-buster test assumes that the Poissonian shot noise for each
type of galaxy in each pixel is uncorrelated (i.e., that the shot noise
matrix $\mathbf{N}$ is diagonal) and that this shot noise can be
approximated as Gaussian. To test the impact of these assumptions on our
results, we repeated the null-buster analysis using randomly split
galaxy samples rather than splitting by luminosity or colour. For
each volume V1-V6, we created two samples by generating a
uniformly distributed random number for each galaxy and assigning it
to sample 1 for numbers > 0.5 and sample 2 otherwise.
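The random split described above can be sketched in a few lines. This is only an illustration: the function name, the seed argument and the stand-in catalogue of integer IDs are assumptions, not part of the original analysis pipeline.

```python
import random


def split_randomly(galaxies, seed=None):
    """Assign each galaxy to one of two samples using a uniform random draw."""
    rng = random.Random(seed)
    sample_1, sample_2 = [], []
    for galaxy in galaxies:
        # rng.random() is uniform on [0, 1): draws above 0.5 go to sample 1.
        if rng.random() > 0.5:
            sample_1.append(galaxy)
        else:
            sample_2.append(galaxy)
    return sample_1, sample_2


# Hypothetical catalogue: integer IDs standing in for galaxies in one volume.
galaxy_ids = list(range(10000))
sample_1, sample_2 = split_randomly(galaxy_ids, seed=42)
```

Because the assignment ignores luminosity and colour, both samples trace the same underlying density field, which is why the expected relative bias is 1.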

If the null-buster test is accurate, we expect the pairwise
comparison for the randomly split samples to be consistent with
deterministic linear bias with $b_{\rm rel} = 1$. The results are shown
in Fig. A1: we find that, indeed, deterministic linear bias is not ruled
out, with nearly all of the $\nu_{\rm min}$ points falling within
$\pm 2$. Furthermore, the measured values of $b_{\rm rel}$ are seen to
be consistent with 1. Thus, we detect no systematic effects due to the
null-buster assumptions.



A1.2 Offset pixel positions

To test if our results are stable against the pixelization chosen, par- 
ticularly at large scales where we have a small number of cells, 
we shifted the locations on the sky of the angular pixels defining 
the cells by half a pixel width in declination. Applying the null- 
buster analysis to the offset pixels reveals no significant differences 
from the original analysis: the luminosity-dependent comparisons 
for all, red, and blue galaxies are still consistent with deterministic 
linear bias, and the colour-dependent comparison still shows strong 
evidence for stochasticity and/or nonlinearity, especially at smaller 
scales. 

For the all-, red-, and blue-galaxy, luminosity-dependent
comparisons, we also compared the measured values of $b_{\rm rel}$ from
the offset analysis with the original analysis: we took the difference
between the two measured values in each volume at each resolution
and divided this by the larger of the error bars on the two analyses
to determine the number of sigmas by which the two analyses differ.
In order to be conservative, we did not add the error bars from
the two analyses in quadrature, since this would overestimate the
error on the difference if they are correlated and this would make
our test less robust. Note that a fully proper treatment would
necessitate accounting for the correlations between the errors from each
analysis, which we have not done; the discussions in this and
the following sections are meant only to serve as a crude reality
check.
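The comparison statistic described above, and the reason it is the more conservative choice, can be sketched as follows. The function names and the numerical values are illustrative assumptions.

```python
import math


def n_sigma_larger_error(value_a, err_a, value_b, err_b):
    """Difference divided by the larger of the two error bars."""
    return (value_a - value_b) / max(err_a, err_b)


def n_sigma_quadrature(value_a, err_a, value_b, err_b):
    """Difference divided by the quadrature sum (appropriate only for
    independent errors); shown for comparison."""
    return (value_a - value_b) / math.hypot(err_a, err_b)


# Hypothetical b_rel measurements from two analyses of the same cells.
conservative = n_sigma_larger_error(1.10, 0.04, 1.00, 0.03)
independent = n_sigma_quadrature(1.10, 0.04, 1.00, 0.03)
```

Since the larger error bar is never greater than the quadrature sum, the first statistic always reports a deviation at least as large in magnitude as the second, so it is less likely to hide a real discrepancy.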

For the error bars on the original analysis, we used the
jackknife uncertainties described in Section B1, and for the offset
analysis error bars we used the generalized $\chi^2$ uncertainties
described in the main text, computed from the offset results. (This is
because we did not perform jackknife resampling for the offset case or
the other modified analyses.)

Figure A1. Null-buster results for randomly split samples.

The results show good agreement: out of a total of 72 measured
$b_{\rm rel}$ values, only 4 differ by more than $2\sigma$ (all galaxies
in V3 and V5 at the second-smallest cell size, at $2.6\sigma$ and
$3.2\sigma$ respectively, and the red galaxies in V3 and V5 at the
second-smallest cell size, at $2.2\sigma$ and $3.3\sigma$ respectively).
As a rough test for systematic trends, we also counted the number of
measurements for which the measured value of $b_{\rm rel}$ is larger in
each analysis. We found that in 38 cases the value from the offset
analysis was larger, and in 34 cases the value from the original
analysis was larger, indicating no systematic trends in the deviations.
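One way to put a number on such a count is an exact two-sided binomial sign test. This is not a calculation from the original analysis, and it assumes the 72 measurements are independent, which they are not (neighbouring scales and overlapping volumes are correlated), so the result is only a rough guide. The function name is an assumption.

```python
from math import comb


def sign_test_p_value(n_larger, n_total):
    """Two-sided exact binomial p-value for n_larger 'wins' out of n_total
    comparisons, under the null hypothesis that either analysis is equally
    likely to give the larger value."""
    k = min(n_larger, n_total - n_larger)
    tail = sum(comb(n_total, i) for i in range(k + 1)) / 2 ** n_total
    return min(1.0, 2.0 * tail)


# Offset-pixel comparison: 38 of 72 values were larger in the offset analysis.
p_offset = sign_test_p_value(38, 72)
```

A p-value well above conventional thresholds, as obtained for a 38/34 split, is in line with the statement that no systematic trend is seen.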



A1.3 No finger-of-god compression 

Our analysis used the finger-of-god compression algorithm from
Tegmark et al. (2004b) with a threshold density of $\delta_c = 200$.
This gives a first-order correction for redshift space distortions, but
it complicates comparisons to other analyses which work purely
in redshift space (e.g. Wild et al. 2005) or use projected correlation
functions (e.g. Zehavi et al. 2005), particularly at small scales
($< 10\,h^{-1}\,$Mpc) where the effects of virialized galaxy clusters
could be significant. To test the sensitivity of our results to this
correction, we repeated the null-buster analysis with no finger-of-god
compression.

The results show excellent agreement with the original analysis:
the smallest-scale measurements for the colour-dependent
comparison only rule out deterministic linear bias at $30\sigma$
rather than $40\sigma$, but the conclusions remain the same.
Additionally, we compared the measured $b_{\rm rel}$ values to the
original analysis as in Section A1.2 and find all 72 measurements to be
within $2\sigma$. In 25 cases, the analysis without finger-of-god
compression gave a larger $b_{\rm rel}$ value, and in 47 cases the
original analysis gave a larger value. This indicates that there might
be a very slight tendency to underestimate $b_{\rm rel}$ if
fingers-of-god are not accounted for, but the effect is
quite small and well within our error bars. Thus, the finger-of-god
compression has no substantial impact on our results.



A1.4 Uncorrelated signal matrix 

The null-buster test requires a choice of residual signal matrix
$\mathbf{S}_\Delta$. Our analysis uses a signal matrix derived from the
matter power spectrum, thus accounting for cosmological correlations
between neighbouring cells. However, these correlations are commonly
assumed to be negligible in other counts-in-cells analyses
(Blanton 2000; Wild et al. 2005; Conway et al. 2005). To test the
sensitivity to the choice of $\mathbf{S}_\Delta$, we repeated the
analysis using $\mathbf{S}_\Delta$ equal to the identity matrix.
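To make the effect of this choice concrete, the sketch below evaluates a null-buster statistic for the special case of a diagonal noise matrix and an identity signal matrix. It assumes the generalized $\chi^2$ form $\nu = [\mathbf{x}^T \mathbf{N}^{-1}\mathbf{S}\mathbf{N}^{-1}\mathbf{x} - {\rm tr}(\mathbf{N}^{-1}\mathbf{S})] / [2\,{\rm tr}(\mathbf{N}^{-1}\mathbf{S}\mathbf{N}^{-1}\mathbf{S})]^{1/2}$, which is not restated in this appendix; the function name, the residual vector and the noise variances are all hypothetical.

```python
import math


def null_buster_identity_signal(residuals, noise_variances):
    """Null-buster statistic for diagonal N (per-cell variances) and S = I.

    With these choices the matrix expression reduces to per-cell sums:
    nu = (sum x_i^2 / n_i^2 - sum 1 / n_i) / sqrt(2 * sum 1 / n_i^2).
    """
    signal_term = sum(x * x / (n * n) for x, n in zip(residuals, noise_variances))
    noise_term = sum(1.0 / n for n in noise_variances)
    normalisation = math.sqrt(2.0 * sum(1.0 / (n * n) for n in noise_variances))
    return (signal_term - noise_term) / normalisation


# Hypothetical residuals whose squares equal the noise variance in every cell,
# i.e. exactly what pure noise would give on average.
nu_null = null_buster_identity_signal([2.0, 2.0, 3.0], [4.0, 4.0, 9.0])
```

Under the null hypothesis the statistic has zero mean and unit variance, so values of a few or more indicate residual signal beyond shot noise.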

Again, we find the results to agree well with the original analysis
and lead to the same conclusions. When comparing the measured
$b_{\rm rel}$ values to the original analysis as in Section A1.2, we find
only 4 out of 72 points differing by more than $2\sigma$ (all galaxies
in V5 at the smallest cell size, at $-2.6\sigma$, red galaxies in V4 at
the second-smallest and smallest cell size, at $2.2\sigma$ and
$2.8\sigma$ respectively, and blue galaxies in V4 at the smallest cell
size, at $-3.1\sigma$). In 39 cases the value of $b_{\rm rel}$ is larger
with the uncorrelated signal matrix, and in 33 cases $b_{\rm rel}$ is
larger in the original analysis, indicating no strong
systematic effects. Thus we expect our results to be directly
comparable to other counts-in-cells analyses done without accounting
for cosmological correlations.



APPENDIX B: UNCERTAINTY CALCULATIONS 

B1 Jackknife uncertainties for null-buster analysis

We use jackknife resampling to calculate the uncertainties for the
null-buster analysis. The concept is as follows: divide the area covered
on the sky into $N$ spatially contiguous regions, and then repeat the
analysis $N$ times, omitting each of the $N$ regions in turn. The
covariance matrix for the measured parameters is then estimated by

$$\Sigma^{ij}_{b_{\rm rel}} = \frac{N-1}{N} \sum_{k=1}^{N} \left(b^{i}_{{\rm rel},k} - \bar{b}^{i}_{\rm rel}\right)\left(b^{j}_{{\rm rel},k} - \bar{b}^{j}_{\rm rel}\right), \qquad \mathrm{(B1)}$$

where superscripts $i$ and $j$ denote measurements of $b_{\rm rel}$ in
different volumes and at different scales, $b^{i}_{{\rm rel},k}$ denotes
the value of $b^{i}_{\rm rel}$ with the $k$th jackknife region omitted,
and $\bar{b}^{i}_{\rm rel}$ is the average over all $N$ values of
$b^{i}_{{\rm rel},k}$.

Figure B1. Comparison of two methods for calculating uncertainties on $b_{\rm rel}$ from the null-buster analysis: jackknife resampling (red) and the generalized $\chi^2$
method (black). Also shown are the results for $b_{\rm rel}$ from the likelihood analysis (grey).

For our analysis, we use the 15 pixels at our lowest resolution
(upper left panel of the figure showing the angular pixelization) as
the jackknife regions. However, since we use a looser completeness cut
at the lowest resolution, two of these pixels cover an area that is not
used at higher resolutions. We therefore chose not to include these two
pixels in our jackknifes, since they do not omit much (or any) area at
the higher resolutions, so our jackknife resampling has $N = 13$. This
technique allows us to estimate the uncertainties on all of our
$b_{\rm rel}$ measurements as well as the covariance matrix quantifying
the correlations between them. We use these covariances in the
model-fitting done in Sections 4.2 and 4.3.
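A delete-one jackknife covariance estimate of this kind can be sketched in plain Python. The sketch assumes the standard $(N-1)/N$ prefactor; the function name and the toy input are illustrative and do not reproduce measured $b_{\rm rel}$ values.

```python
def jackknife_covariance(leave_one_out):
    """Delete-one jackknife covariance matrix.

    leave_one_out[k][i] is the i-th measured parameter (for example b_rel in
    one volume at one scale) obtained with jackknife region k omitted.
    """
    n_regions = len(leave_one_out)
    n_params = len(leave_one_out[0])
    means = [
        sum(row[i] for row in leave_one_out) / n_regions for i in range(n_params)
    ]
    factor = (n_regions - 1) / n_regions
    return [
        [
            factor
            * sum((row[i] - means[i]) * (row[j] - means[j]) for row in leave_one_out)
            for j in range(n_params)
        ]
        for i in range(n_params)
    ]


# Toy input: 3 jackknife regions, 2 measured parameters (hypothetical numbers).
toy_measurements = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
covariance = jackknife_covariance(toy_measurements)
uncertainties = [covariance[i][i] ** 0.5 for i in range(2)]
```

The diagonal gives the squared error bars, and the off-diagonal elements carry the correlations between measurements that enter the model fits.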

Figure B1 shows the uncertainties on $b_{\rm rel}$ as calculated from
jackknife resampling compared to those calculated with the
generalized $\chi^2$ method described in the main text. Overall, the two
methods agree well, but the jackknife method gives larger uncertainties
at the smallest scales and in volume V1. The jackknife uncertainties in
volume V1 are large because it is significantly
smaller than the other volumes, and it is small enough that omitting
a cell containing just one large cluster can have a substantial
effect on the measured value of $b_{\rm rel}$. Thus the large
uncertainties in V1 reflect the effects of sampling a small volume.
Since there are so few dim red galaxies, these effects are particularly
egregious for the measurements of luminosity-dependent bias of red
galaxies in V1. Thus, based on the jackknife results, we elected to not
use V1 in our analysis of the red galaxies.

B2 Likelihood uncertainties 

B2.1 Likelihood contours

As described in Section 3.3, we calculate the uncertainty on
$b_{\rm rel}$ and $r_{\rm rel}$ for the likelihood method using the
$\Delta(2\ln\mathcal{L}) = 1$ contour in the $b_{\rm rel}$-$r_{\rm rel}$
plane after marginalizing over $\sigma_1^2$. This means that for each
comparison volume and at each resolution, we calculate $\mathcal{L}$
from equation (22) over a grid of $b_{\rm rel}$ and $r_{\rm rel}$ values
and maximize $2\ln\mathcal{L}$ with respect to $\sigma_1^2$ at each grid
point. This gives us a 2-dimensional likelihood function, which we then
maximize to find the best-fit values for $b_{\rm rel}$ and
$r_{\rm rel}$. Uncertainties are calculated using the function

$$\Delta\left(2\ln\mathcal{L}(b_{\rm rel}, r_{\rm rel})\right) = 2\ln\mathcal{L}(b_{\rm rel}^{\max}, r_{\rm rel}^{\max}) - 2\ln\mathcal{L}(b_{\rm rel}, r_{\rm rel}). \qquad \mathrm{(B2)}$$

Typical contour plots of this function for volume V5 at each of the
three cell sizes used are shown in Fig. B2.

Figure B2. Typical contour plots of $\Delta(2\ln\mathcal{L})$ for volume V5 for three different resolutions corresponding (from left- to right-hand-side) to cell sizes of 89,
39, and 21 $h^{-1}\,$Mpc. Blue contours denote the 1, 2 and $3\sigma$ two-dimensional confidence regions, and black contours denote the $1\sigma$ one-dimensional confidence
region used for computing error bars on $b_{\rm rel}$ and $r_{\rm rel}$. The red contours denote the error ellipse calculated from the second-order approximation to $2\ln\mathcal{L}$ at
the best-fit point, marked with a $\times$.
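The grid procedure described above (maximize over $\sigma_1^2$ at each grid point, then form $\Delta(2\ln\mathcal{L})$ relative to the best fit) can be sketched as follows. The likelihood used here is a toy quadratic stand-in, not the likelihood of the paper, and all function names and grid values are assumptions made for illustration.

```python
def profile_surface(two_ln_like, b_grid, r_grid, sigma_sq_grid):
    """For each (b_rel, r_rel) grid point, keep the maximum of 2 ln L over
    the nuisance parameter sigma_1^2."""
    return {
        (b, r): max(two_ln_like(s, b, r) for s in sigma_sq_grid)
        for b in b_grid
        for r in r_grid
    }


def delta_surface(profiled):
    """Best-fit grid point and Delta(2 ln L) = best value - value everywhere."""
    best_point = max(profiled, key=profiled.get)
    best_value = profiled[best_point]
    delta = {point: best_value - value for point, value in profiled.items()}
    return best_point, delta


def toy_two_ln_like(sigma_sq, b_rel, r_rel):
    """Hypothetical stand-in for 2 ln L, peaked at (0.5, 1.2, 0.9)."""
    return (
        -((sigma_sq - 0.5) / 0.1) ** 2
        - ((b_rel - 1.2) / 0.06) ** 2
        - ((r_rel - 0.9) / 0.025) ** 2
    )


b_grid = [1.0 + 0.05 * i for i in range(9)]       # 1.00 to 1.40
r_grid = [0.80 + 0.02 * i for i in range(11)]     # 0.80 to 1.00
sigma_sq_grid = [0.1 * i for i in range(1, 11)]   # 0.1 to 1.0
profiled = profile_surface(toy_two_ln_like, b_grid, r_grid, sigma_sq_grid)
best_point, delta = delta_surface(profiled)
```

By construction the resulting surface is zero at the best-fit grid point and positive elsewhere, so contours of constant value enclose the best fit.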

We define 1- and 2-dimensional confidence regions using
the standard procedures detailed in Press et al. (1992), using
$\Delta(2\ln\mathcal{L})$ as an equivalent to $\Delta\chi^2$: the
$1\sigma$ (68.3%) 1-dimensional confidence region is given by
$\Delta(2\ln\mathcal{L}) = 1$, so we define our error bars on
$b_{\rm rel}$ and $r_{\rm rel}$ by projecting the
$\Delta(2\ln\mathcal{L}) = 1$ contour (shown in black in Fig. B2) onto
the $b_{\rm rel}$ and $r_{\rm rel}$ axes. For illustrative purposes we
also show the $1\sigma$, $2\sigma$, and $3\sigma$ (68.3%, 95.4%, and
99.73%) 2-dimensional confidence regions in these plots, given by
$\Delta(2\ln\mathcal{L}) = 2.30$, 6.17, and 11.8 respectively.
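The contour levels quoted above follow from the $\chi^2$ distribution with two degrees of freedom, whose cumulative distribution is $1 - e^{-x/2}$. The sketch below computes them and also shows a coarse grid version of the projection onto the axes. The function names and the quadratic test surface are hypothetical.

```python
import math


def delta_level_2d(n_sigma):
    """Delta(2 ln L) level enclosing, for two jointly fitted parameters, the
    same probability as +/- n_sigma of a one-dimensional Gaussian."""
    probability = math.erf(n_sigma / math.sqrt(2.0))
    return -2.0 * math.log(1.0 - probability)


def project_region(delta, level=1.0):
    """Extent along each axis of the grid points with delta <= level.

    This is a coarse stand-in for projecting a smooth contour: the result
    is only as fine as the grid spacing.
    """
    inside = [point for point, value in delta.items() if value <= level]
    b_values = [b for b, _ in inside]
    r_values = [r for _, r in inside]
    return (min(b_values), max(b_values)), (min(r_values), max(r_values))


levels_2d = [delta_level_2d(k) for k in (1, 2, 3)]

# Hypothetical quadratic Delta(2 ln L) surface on a (b_rel, r_rel) grid.
toy_delta = {
    (b / 100, r / 100): ((b / 100 - 1.20) / 0.06) ** 2
    + ((r / 100 - 0.90) / 0.025) ** 2
    for b in range(100, 141, 5)
    for r in range(80, 101, 2)
}
b_range, r_range = project_region(toy_delta, level=1.0)
```

The computed levels agree with the tabulated 2.30, 6.17 and 11.8 to within the rounding of the quoted probabilities.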

To check the goodness of fit, we also compute an effective
value of $\chi^2$:

$$\chi^2_{\rm eff} \equiv -2\ln\mathcal{L} - \ln|\mathbf{C}| - 2n\ln(2\pi), \qquad \mathrm{(B3)}$$



where $\mathbf{C}$ is the covariance matrix entering $\mathcal{L}$ and
$n$ is the number of cells. If our model is a good fit, the value of
$\chi^2_{\rm eff}$ at the best-fit parameter values should be close to
the number of degrees of freedom, given by ${\rm dof} = 2n - 2$ ($2n$
data points for type 1 and 2 galaxies in each cell minus 2 parameters
$b_{\rm rel}$ and $r_{\rm rel}$). We calculated
$\chi^2_{\rm eff}/{\rm dof}$ for each volume and resolution and found
they all lie quite close to 1, ranging from a minimum value of 0.678 to
a maximum value of 1.11. Thus this test indicates our model is a good
fit.

The uncertainties on $b_{\rm rel}$ and $r_{\rm rel}$ could perhaps be
calculated more accurately using jackknife resampling as we did for the
null-buster case; however, repeating the analysis for each jackknife
sample is computationally prohibitive since performing all the
calculations for just one likelihood analysis took several months of CPU
time.

B2.2 $b_{\rm rel}$-$r_{\rm rel}$ covariance matrices

Alternatively, we can calculate the uncertainties using the parameter
covariance matrix at the best-fit parameter values, as is commonly
done in $\chi^2$ analyses. The Hessian matrix of second derivatives
is given by

$$\mathbf{H} \equiv \begin{pmatrix} \dfrac{\partial^2 (2\ln\mathcal{L})}{\partial b_{\rm rel}^2} & \dfrac{\partial^2 (2\ln\mathcal{L})}{\partial b_{\rm rel}\,\partial r_{\rm rel}} \\ \dfrac{\partial^2 (2\ln\mathcal{L})}{\partial r_{\rm rel}\,\partial b_{\rm rel}} & \dfrac{\partial^2 (2\ln\mathcal{L})}{\partial r_{\rm rel}^2} \end{pmatrix}, \qquad \mathrm{(B4)}$$

and the parameter covariance matrix is given by

$$\begin{pmatrix} \sigma^2_{b_{\rm rel}} & \sigma^2_{b_{\rm rel} r_{\rm rel}} \\ \sigma^2_{b_{\rm rel} r_{\rm rel}} & \sigma^2_{r_{\rm rel}} \end{pmatrix} = -2\,\mathbf{H}^{-1}. \qquad \mathrm{(B5)}$$

Thus the uncertainties are given by $\sigma_{b_{\rm rel}}$ and
$\sigma_{r_{\rm rel}}$ with this method. This is equivalent to
approximating the likelihood function $2\ln\mathcal{L}$ with its
second-order Taylor series about the best-fit point, and it defines an
error ellipse that approximates the $\Delta(2\ln\mathcal{L}) = 1$
contour. These error ellipses are shown in Fig. B2 in red, and are seen
to be in close agreement with the true $\Delta(2\ln\mathcal{L}) = 1$
contours.

This method also allows us to measure the correlation between
$b_{\rm rel}$ and $r_{\rm rel}$ by calculating the correlation
coefficient, given by

$$R \equiv \frac{\sigma^2_{b_{\rm rel} r_{\rm rel}}}{\left(\sigma^2_{b_{\rm rel}}\,\sigma^2_{r_{\rm rel}}\right)^{1/2}}. \qquad \mathrm{(B6)}$$

$R$ will fall between $-1$ (perfectly anti-correlated) and 1 (perfectly
correlated). Effectively this measures the tilt of the error ellipse
in the $b_{\rm rel}$-$r_{\rm rel}$ plane. Overall we find the values of
$R$ to be quite small (typically $|R| \sim 0.05$), indicating no large
correlations between $b_{\rm rel}$ and $r_{\rm rel}$. Out of the 72
points we calculate, only 6 have $|R| > 0.3$. The cases with the largest
$R$ values are for blue galaxies in volume V6, where the uncertainties
are quite large due to the small number of bright blue galaxies and the
error ellipses are not good approximations to the likelihood contours
anyway; thus these few cases with large $R$ are not overly concerning.
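The step from the Hessian to the error bars and the correlation coefficient can be sketched for the 2x2 case. The sketch assumes the covariance is $-2\,\mathbf{H}^{-1}$, which is the form implied by a second-order Taylor expansion of $2\ln\mathcal{L}$ about its maximum; the function names and the Hessian entries are hypothetical.

```python
import math


def covariance_from_hessian(h_bb, h_br, h_rr):
    """Parameter covariance -2 H^{-1} for the symmetric 2x2 Hessian
    H = [[h_bb, h_br], [h_br, h_rr]] of 2 ln L at the best-fit point.

    Returns (var_b, cov_br, var_r)."""
    det = h_bb * h_rr - h_br * h_br
    if det == 0:
        raise ValueError("Hessian is singular")
    # Inverse of a 2x2 matrix: (1 / det) * [[h_rr, -h_br], [-h_br, h_bb]].
    var_b = -2.0 * h_rr / det
    cov_br = 2.0 * h_br / det
    var_r = -2.0 * h_bb / det
    return var_b, cov_br, var_r


def correlation_coefficient(var_b, cov_br, var_r):
    """Correlation coefficient R between the two fitted parameters."""
    return cov_br / math.sqrt(var_b * var_r)


# Hypothetical Hessian corresponding to a covariance of [[4, 1], [1, 1]].
var_b, cov_br, var_r = covariance_from_hessian(-2.0 / 3.0, 2.0 / 3.0, -8.0 / 3.0)
sigma_b, sigma_r = math.sqrt(var_b), math.sqrt(var_r)
R = correlation_coefficient(var_b, cov_br, var_r)
```

At a maximum the Hessian of $2\ln\mathcal{L}$ is negative definite, so the minus sign is what makes the recovered variances positive.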
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B2.3 Comparison with null-buster results 

Finally, we compare the results for $b_{\rm rel}$ from the likelihood
method to the results from the null-buster analysis in Fig. B1, with the
likelihood points shown in grey. As can be seen in this plot, the
likelihood and null-buster values for $b_{\rm rel}$ agree within the
uncertainties, even for the colour-dependent bias where the null-buster
values are not necessarily accurate since deterministic linear bias
is ruled out. Thus our two analysis methods are in excellent
agreement with each other.
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